SOURCEHTML http://www3.ncbi.nlm.nih.gov/htbin-post/Entrez/query?uid=327742&form=6&db=n&Dopt=g
LOCUS       HIVHXB2CG    9718 bp ss-RNA             VRL       14-JAN-1992
DEFINITION  Human immunodeficiency virus type 1 (HXB2), complete genome;
            HIV1/HTLV-III/LAV reference genome.
ACCESSION   K03455 M38432
NID         g327742
KEYWORDS    TAR protein; acquired immune deficiency syndrome; complete genome;
            env protein; gag protein; long terminal repeat (LTR); pol protein;
            polyprotein; proviral gene; reverse transcriptase; transactivator.
SOURCE      HTLV-III/LAV (isolate HXB2) proviral DNA.
  ORGANISM  Human immunodeficiency virus type 1
            Viruses; Retroid viruses; Retroviridae; Lentivirus; Primate
            lentivirus group.
REFERENCE   1  (sites)
  AUTHORS   Rosen,C.A., Sodroski,J.G. and Haseltine,W.A.
  TITLE     The location of cis-acting regulatory sequences in the human T cell
            lymphotropic virus type III (HTLV-III/LAV) long terminal repeat
  JOURNAL   Cell 41, 813-823 (1985)
  MEDLINE   85228232
REFERENCE   2  (bases 9577 to 9718; 493 to 674)
  AUTHORS   Wong-Staal,F., Gallo,R.C., Chang,N.T., Ghrayeb,J., Papas,T.S.,
            Lautenberger,J.A., Pearson,M.L., Petteway,S.R.Jr., Ivanoff,L.,
            Baumeister,K., Whitehorn,E.A., Rafalski,J.A., Doran,E.R.,
            Josephs,S.F., Starcich,B., Livak,K.J., Patarca,R., Haseltine,W.A.
            and Ratner,L.
  TITLE     Complete nucleotide sequence of the AIDS virus, HTLV-III
  JOURNAL   Nature 313, 277-284 (1985)
  MEDLINE   85111123
REFERENCE   3  (sites)
  AUTHORS   van Beveren,C.P., Coffin,J. and Hughes,S.
  TITLE     Appendix B: HTLV-3/LAV genome
  JOURNAL   (in) Weiss,R.L., Teich,N., Varmus,H. and Coffin,J. (Eds.);
            RNA TUMOR VIRUSES, SECOND EDITION, 2, Vol. 2: 1102-1123;
            Cold Spring Harbor Laboratory, Cold Spring Harbor (1985)
REFERENCE   4  (bases 1 to 653)
  AUTHORS   Starcich,B., Ratner,L., Josephs,S.F., Okamoto,T., Gallo,R.C. and
            Wong-Staal,F.
  TITLE     Characterization of long terminal repeat sequences of HTLV-III
  JOURNAL   Science 227, 538-540 (1985)
  MEDLINE   85090465
REFERENCE   5  (sites)
  AUTHORS   Allan,J.S., Coligan,J.E., Barin,F., McLane,M.F., Sodroski,J.G.,
            Rosen,C.A., Haseltine,W.A., Lee,T.H. and Essex,M.
  TITLE     Major glycoprotein antigens that induce antibodies in AIDS patients
            are encoded by HTLV-III
  JOURNAL   Science 228, 1091-1094 (1985)
  MEDLINE   85192537
REFERENCE   6  (sites)
  AUTHORS   Arya,S.K., Guo,C., Josephs,S.F. and Wong-Staal,F.
  TITLE     Trans-activator gene of human T-lymphotropic virus type III
            (HTLV-III)
  JOURNAL   Science 229, 69-73 (1985)
  MEDLINE   85244626
REFERENCE   7  (sites)
  AUTHORS   Sodroski,J., Patarca,R., Rosen,C., Wong-Staal,F. and Haseltine,W.A.
  TITLE     Location of the trans-activating region on the genome of human
            T-cell lymphotropic virus type III
  JOURNAL   Science 229, 74-77 (1985)
  MEDLINE   85244627
REFERENCE   8  (sites)
  AUTHORS   Rabson,A.B., Daugherty,D.F., Venkatesan,S., Boulukos,K.E.,
            Benn,S.I., Folks,T.M., Feorino,P. and Martin,M.
  TITLE     Transcription of novel open reading frames of AIDS retrovirus
            during infection of lymphocytes
  JOURNAL   Science 229, 1388-1390 (1985)
  MEDLINE   85300515
REFERENCE   9  (sites)
  AUTHORS   Allan,J.S., Coligan,J.E., Lee,T.-H., McLane,M.F., Kanki,P.J.,
            Groopman,J.E. and Essex,M.
  TITLE     A new HTLV-III/LAV encoded antigen detected by antibodies from AIDS
            patients
  JOURNAL   Science 230, 810-813 (1985)
  MEDLINE   86044509
REFERENCE   10 (sites)
  AUTHORS   Dayton,A.I., Sodroski,J.G., Rosen,C.A., Goh,W.C. and Haseltine,W.A.
  TITLE     The trans-activator gene of the human T cell lymphotropic virus
            type III is required for replication
  JOURNAL   Cell 44, 941-947 (1986)
  MEDLINE   86161683
REFERENCE   11 (sites)
  AUTHORS   Starcich,B.R., Hahn,B.H., Shaw,G.M., McNeely,P.D., Modrow,S.,
            Wolf,H., Parks,E.S., Parks,W.P., Josephs,S.F., Gallo,R.C. and
            Wong-Staal,F.
  TITLE     Identification and characterization of conserved and variable
            regions in the envelope gene of HTLV-III/LAV, the retrovirus of
            AIDS
  JOURNAL   Cell 45, 637-648 (1986)
  MEDLINE   86218077
REFERENCE   12 (sites)
  AUTHORS   Feinberg,M.B., Jarret,R.F., Aldovini,A., Gallo,R.C. and
            Wong-Staal,F.
  TITLE     HTLV-III expression and production involve complex regulation at
            the levels of splicing and translation of viral RNA
  JOURNAL   Cell 46, 807-817 (1986)
  MEDLINE   87002448
REFERENCE   13 (sites)
  AUTHORS   Terwilliger,E., Sodroski,J.G., Rosen,C.A. and Haseltine,W.A.
  TITLE     Effects of mutations within the 3' orf open reading frame region of
            human T-cell lymphotropic virus type III (HTLV-III/LAV) on
            replication and cytopathogenicity
  JOURNAL   J. Virol. 60, 754-760 (1986)
  MEDLINE   87036943
REFERENCE   14 (sites)
  AUTHORS   Lightfoote,M.M., Coligan,J.E., Folks,T.M., Fauci,A.S., Martin,M.A.
            and Venkatesan,S.
  TITLE     Structural characterization of reverse transcriptase and
            endonuclease polypeptides of the acquired immunodeficiency syndrome
            retrovirus
  JOURNAL   J. Virol. 60, 771-775 (1986)
  MEDLINE   87036947
REFERENCE   15 (sites)
  AUTHORS   Rosen,C.A., Sodroski,J.G., Goh,W.C., Dayton,A.I., Lippke,J.A. and
            Haseltine,W.A.
  TITLE     Post-transcriptional regulation accounts for the trans-activation
            of the human T-lymphotropic virus type III
  JOURNAL   Nature 319, 555-559 (1986)
  MEDLINE   86118720
REFERENCE   16 (sites)
  AUTHORS   Sodroski,J., Goh,W.C., Rosen,C., Dayton,A.I., Terwilliger,E. and
            Haseltine,W.A.
  TITLE     A second post-transcriptional trans-activator gene required for
            HTLV-III replication
  JOURNAL   Nature 321, 412-417 (1986)
  MEDLINE   86230863
REFERENCE   17 (sites)
  AUTHORS   Arya,S.K. and Gallo,R.C.
  TITLE     Three novel genes of human T-lymphotropic virus type III: Immune
            reactivity of their products with sera from acquired immune
            deficiency syndrome patients
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 83, 2209-2213 (1986)
  MEDLINE   86177573
REFERENCE   18 (sites)
  AUTHORS   Willey,R., Rutledge,R.A., Dias,S., Folks,T., Theodore,T.,
            Buckler,C.E. and Martin,M.A.
  TITLE     Identification of conserved and divergent domains within the
            envelope gene of the acquired immunodeficiency syndrome virus
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 83, 5038-5042 (1986)
  MEDLINE   86259728
REFERENCE   19 (sites)
  AUTHORS   di Marzo Veronese,F., Copeland,T.D., DeVico,A.L., Rahman,R.,
            Oroszlan,S., Gallo,R.C. and Sarngadharan,M.G.
  TITLE     Characterization of highly immunogenic p66/p51 as the reverse
            transcriptase of HTLV-III/LAV
  JOURNAL   Science 231, 1289-1291 (1986)
  MEDLINE   86122937
REFERENCE   20 (sites)
  AUTHORS   Lee,T.-H., Coligan,J.E., Allan,J.S., McLane,M.F., Groopman,J.E. and
            Essex,M.
  TITLE     A new HTLV-III/LAV protein encoded by a gene found in cytopathic
            retroviruses
  JOURNAL   Science 231, 1546-1549 (1986)
  MEDLINE   86151661
REFERENCE   21 (sites)
  AUTHORS   Sodroski,J., Goh,W.C., Rosen,C., Tartar,A., Portetelle,D., Burny,A.
            and Haseltine,W.A.
  TITLE     Replicative and cytopathic potential of HTLV-III/LAV with sor gene
            deletions
  JOURNAL   Science 231, 1549-1553 (1986)
  MEDLINE   86151662
REFERENCE   22 (sites)
  AUTHORS   Kan,N.C., Franchini,G., Wong-Staal,F., DuBois,G.C., Robey,W.G.,
            Lautenberger,J.A. and Papas,T.S.
  TITLE     Identification of HTLV-III/LAV sor gene product and detection of
            antibodies in human sera
  JOURNAL   Science 231, 1553-1555 (1986)
  MEDLINE   86151663
REFERENCE   23 (sites)
  AUTHORS   Kramer,R.A., Schaber,M.D., Skalka,A.M., Ganguly,K., Wong-Staal,F.
            and Reddy,P.E.
  TITLE     HTLV-III gag protein is processed in yeast cells by the virus
            pol-protease
  JOURNAL   Science 231, 1580-1584 (1986)
  MEDLINE   86151671
REFERENCE   24 (sites)
  AUTHORS   Jones,K.A., Kadonaga,J.T., Luciw,P.A. and Tjian,R.
  TITLE     Activation of the AIDS retrovirus promoter by the cellular
            transcription factor, Sp1
  JOURNAL   Science 232, 755-759 (1986)
  MEDLINE   86179897
REFERENCE   25 (bases 8761 to 9060)
  AUTHORS   Fisher,A.G., Ratner,L., Mitsuya,H., Marselle,L.M., Harper,M.E.,
            Broder,S., Gallo,R.C. and Wong-Staal,F.
  TITLE     Infectious mutants of HTLV-III with changes in the 3' region and
            markedly reduced cytopathic effects
  JOURNAL   Science 233, 655-659 (1986)
  MEDLINE   86261824
REFERENCE   26 (sites)
  AUTHORS   Wright,C.M., Felber,B.K., Paskalis,H. and Pavlakis,G.N.
  TITLE     Expression and characterization of the trans-activator of
            HTLV-III/LAV virus
  JOURNAL   Science 234, 988-992 (1986)
  MEDLINE   87042788
REFERENCE   27 (bases )
  AUTHORS   Ratner,L.
  JOURNAL   Unpublished (1987)
REFERENCE   28 (sites)
  AUTHORS   Wong-Staal,F., Chanda,P.K. and Ghrayeb,J.
  TITLE     Human immunodeficiency virus: the eighth gene
  JOURNAL   AIDS Res. Hum. Retroviruses 3, 33-39 (1987)
  MEDLINE   87299194
REFERENCE   29 (sites)
  AUTHORS   Patarca,R., Heath,C., Goldenberg,G.J., Rosen,C.A., Sodroski,J.G.,
            Haseltine,W.A. and Hansen,U.M.
  TITLE     Transcription directed by the HIV long terminal repeat in vitro
  JOURNAL   AIDS Res. Hum. Retroviruses 3, 41-55 (1987)
  MEDLINE   87299195
REFERENCE   30 (bases 1 to 9635; 1 to 9635)
  AUTHORS   Ratner,L., Fisher,A., Jagodzinski,L.L., Mitsuya,H., Liou,R.-S.,
            Gallo,R.C. and Wong-Staal,F.
  TITLE     Complete nucleotide sequences of functional clones of the AIDS
            virus
  JOURNAL   AIDS Res. Hum. Retroviruses 3, 57-69 (1987)
  MEDLINE   87299196
REFERENCE   31 (sites)
  AUTHORS   Muesing,M.A., Smith,D.H. and Capon,D.J.
  TITLE     Regulation of mRNA accumulation by a human immunodeficiency virus
            trans-activator protein
  JOURNAL   Cell 48, 691-701 (1987)
  MEDLINE   87131081
REFERENCE   32 (sites)
  AUTHORS   Modrow,S., Hahn,B.H., Shaw,G.M., Gallo,R.C., Wong-Staal,F. and
            Wolf,H.
  TITLE     Computer-assisted analysis of envelope protein sequences of seven
            human immunodeficiency virus isolates: Prediction of antigenic
            epitopes in conserved and variable regions
  JOURNAL   J. Virol. 61, 570-578 (1987)
  MEDLINE   87112954
REFERENCE   33 (sites)
  AUTHORS   Goh,W.C., Sodroski,J.G., Rosen,C.A. and Haseltine,W.A.
  TITLE     Expression of the art gene protein of human T-lymphotropic virus
            type III (HTLV-III/LAV) in bacteria
  JOURNAL   J. Virol. 61, 633-637 (1987)
  MEDLINE   87112968
REFERENCE   34 (sites)
  AUTHORS   Nabel,G. and Baltimore,D.
  TITLE     An inducible transcription factor activates expression of human
            immunodeficiency virus in T cells
  JOURNAL   Nature 326, 711-713 (1987)
  MEDLINE   87173065
REFERENCE   35 (sites)
  AUTHORS   Fisher,A.G., Ensoli,B., Ivanoff,L., Chamberlain,M., Petteway,S.,
            Ratner,L., Gallo,R.C. and Wong-Staal,F.
  TITLE     The sor gene of HIV-1 is required for efficient virus transmission
            in vitro
  JOURNAL   Science 237, 888-893 (1987)
  MEDLINE   87292118
REFERENCE   36 (sites)
  AUTHORS   Ido,E., Han,H.-p., Kezdy,F.J. and Tang,J.
  TITLE     Kinetic studies of human immunodeficiency virus type 1 protease and
            its active-site hydrogen bond mutant A28S
  JOURNAL   J. Biol. Chem. 266, 24359-24366 (1991)
  MEDLINE   92105089
COMMENT     [6]  sites; tat mRNA and other transcript boundaries.
            [7]  sites; tat mRNA.
            [8]  sites; mRNA splice sites.
            [9]  sites; 27K antigen cds.
            [5]  sites; gp160 and gp120 coding sequences.
            [1]  sites; regulatory sequences in the LTR.
            [3]  review; bases 1 to 9718.
            [15]  sites; trans-activator function and TAR sequence.
            [19]  sites; pol coding sequence.
            [22]  sites; 23K sor gene product.
            [23]  sites; pol NH2-terminal region.
            [20]  sites; sor 23K protein.
            [21]  sites; sor 23K protein.
            [24]  sites; Sp1 binding sites in the promoter region.
            [17]  sites; acceptor and donor splice sites for tat and 27K.
            [10]  sites; deletion mutants in the tat gene.
            [18]  sites; env gene conserved/variable regions; separate entries.
            [16]  sites; trs cds boundaries.
            [12]  sites; trs cds boundaries.
            [11]  sites; env gene conserved/variable regions; separate entries.
            [26]  sites; tar or transactivator target.
            [13]  sites; 3' orf mutations.
            [14]  sites; pol p34 terminus.
            [31]  sites; promoter, TAR, tat-III mutants.
            [32]  sites; envelope protein epitopes.
            [33]  sites; trs/art protein.
            [34]  sites; inducible enhancer element.
            [27]  revises [30].
            [29]  sites; long terminal repeat.
            [28]  sites; R orf.
            [35]  sites; sor.
            Sequence for [25] kindly provided in computer-readable form by
            L.Ratner, 19-AUG-1986.
            The HXB2 sequence is being used as a reference genome for all the
            HIV entries because it has been derived from a demonstrably
            infectious clone.  Hence not all of the 'sites' references above
            were concerned with this isolate.
FEATURES             Location/Qualifiers
     source          1..9718
                     /organism="Human immunodeficiency virus type 1"
     LTR             1..634
                     /note="5' LTR"
     repeat_region   454..551
                     /note="R repeat 5' copy"
     prim_transcript 455..9635
                     /note="tat, trs, 27K subgenomic mRNA"
     mRNA            455..9635
                     /note="HXB2 genomic mRNA"
     intron          743..5776
                     /note="tat, trs, 27K mRNA intron 1"
     CDS             789..2291
                     /note="gag polyprotein"
                     /codon_start=1
                     /db_xref="PID:g327745"
                     /translation="MGARASVLSGGELDRWEKIRLRPGGKKKYKLKHIVWASRELERF
                     AVNPGLLETSEGCRQILGQLQPSLQTGSEELRSLYNTVATLYCVHQRIEIKDTKEALD
                     KIEEEQNKSKKKAQQAAADTGHSNQVSQNYPIVQNIQGQMVHQAISPRTLNAWVKVVE
                     EKAFSPEVIPMFSALSEGATPQDLNTMLNTVGGHQAAMQMLKETINEEAAEWDRVHPV
                     HAGPIAPGQMREPRGSDIAGTTSTLQEQIGWMTNNPPIPVGEIYKRWIILGLNKIVRM
                     YSPTSILDIRQGPKEPFRDYVDRFYKTLRAEQASQEVKNWMTETLLVQNANPDCKTIL
                     KALGPAATLEEMMTACQGVGGPGHKARVLAEAMSQVTNSATIMMQRGNFRNQRKIVKC
                     FNCGKEGHTARNCRAPRKKGCWKCGKEGHQMKDCTERQANFLGKIWPSYKGRPGNFLQ
                     SRPEPTAPPEESFRSGVETTTPPQKQEPIDKELYPLTSLRSLFGNDPSSQ"
     CDS             2357..5095
                     /partial
                     /note="pol polyprotein (NH2-terminus uncertain)"
                     /codon_start=1
                     /db_xref="PID:g327746"
                     /translation="MSLPGRWKPKMIGGIGGFIKVRQYDQILIEICGHKAIGTVLVGP
                     TPVNIIGRNLLTQIGCTLNFPISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEIC
                     TEMEKEGKISKIGPENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPH
                     PAGLKKKKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGWK
                     GSPAIFQSSMTKILEPFRKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEELRQHLLRW
                     GLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVNDIQKLVGKLNWASQI
                     YPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAENREILKEPVHGVYYDPSKDLIAE
                     IQKQGQGQWTYQIYQEPFKNLKTGKYARMRGAHTNDVKQLTEAVQKITTESIVIWGKT
                     PKFKLPIQKETWETWWTEYWQATWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDG
                     AANRETKLGKAGYVTNRGRQKVVTLTDTTNQKTELQAIYLALQDSGLEVNIVTDSQYA
                     LGIIQAQPDQSESELVNQIIEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGIRKVL
                     FLDGIDKAQDEHEKYHSNWRAMASDFNLPPVVAKEIVASCDKCQLKGEAMHGQVDCSP
                     GIWQLDCTHLEGKVILVAVHVASGYIEAEVIPAETGQETAYFLLKLAGRWPVKTIHTD
                     NGSNFTGATVRAACWWAGIKQEFGIPYNPQSQGVVESMNKELKKIIGQVRDQAEHLKT
                     AVQMAVFIHNFKRKGGIGGYSAGERIVDIIATDIQTKELQKQITKIQNFRVYYRDSRN
                     SLWKGPAKLLWKGEGAVVIQDNSDIKVVPRRKAKIIRDYGKQMAGDDCVASRQDED"
     CDS             5040..5618
                     /note="sor 23K protein"
                     /codon_start=1
                     /db_xref="PID:g327747"
                     /translation="MENRWQVMIVWQVDRMRIRTWKSLVKHHMYVSGKARGWFYRHHY
                     ESPHPRISSEVHIPLGDARLVITTYWGLHTGERDWHLGQGVSIEWRKKRYSTQVDPEL
                     ADQLIHLYYFDCFSDSAIRKALLGHIVSPRCEYQAGHNKVGSLQYLALAALITPKKIK
                     PPLPSVTKLTEDRWNKPQKTKGHRGSHTMNGH"
     CDS             5558..5794
                     /note="R (ORF) protein"
                     /codon_start=1
                     /db_xref="PID:g327748"
                     /translation="MEQAPEDQGPQREPHNEWTLELLEELKNEAVRHFPRIWLHGLGQ
                     HIYETYGDTWAGVEAIIRILQQLLFIHFQNWVST"
     CDS             join(5830..6044,8378..8423)
                     /note="tat protein"
                     /codon_start=1
                     /db_xref="PID:g327743"
                     /translation="MEPVDPRLEPWKHPGSQPKTACTNCYCKKCCFHCQVCFITKALG
                     ISYGRKKRRQRRRAHQNSQTHQASLSKQPTSQSRGDPTGPKE"
     exon            <5830..6044
                     /note="tat protein,  (first expressed exon)"
                     /number=2
     CDS             join(5969..6044,8378..8652)
                     /note="trs protein"
                     /codon_start=1
                     /db_xref="PID:g327744"
                     /translation="MAGRSGDSDEELIRTVRLIKLLYQSNPPPNPEGTRQARRNRRRR
                     WRERQRQIHSISERILGTYLGRSAEPVPLQLPPLERLTLDCNEDCGTSGTQGVGSPQI
                     LVESPTVLESGTKE"
     exon            <5969..6044
                     /note="trs protein,  (first expressed exon)"
                     /number=2
     intron          6045..8377
                     /note="tat, trs intron 2"
     intron          6045..8377
                     /note="27K mRNA intron 2"
     intron          6045..8377
                     /note="trs intron 2"
     intron          6045..8377
                     /note="tat intron 1"
     CDS             6224..8794
                     /note="envelope polyprotein"
                     /codon_start=1
                     /db_xref="PID:g327749"
                     /translation="MRVKEKYQHLWRWGWRWGTMLLGMLMICSATEKLWVTVYYGVPV
                     WKEATTTLFCASDAKAYDTEVHNVWATHACVPTDPNPQEVVLVNVTENFDMWKNDMVE
                     QMHEDIISLWDQSLKPCVKLTPLCVSLKCTDLKNDTNTNSSSGRMIMEKGEIKNCSFN
                     ISTSIRGKVQKEYAFFYKLDIIPIDNDTTSYSLTSCNTSVITQACPKVSFEPIPIHYC
                     APAGFAILKCNNKTFNGTGPCTNVSTVQCTHGIRPVVSTQLLLNGSLAEEEVVIRSVN
                     FTDNAKTIIVQLNTSVEINCTRPNNNTRKRIRIQRGPGRAFVTIGKIGNMRQAHCNIS
                     RAKWNNTLKQIDSKLREQFGNNKTIIFKQSSGGDPEIVTHSFNCGGEFFYCNSTQLFN
                     STWFNSTWSTEGSNNTEGSDTITLPCRIKQIINMWQKVGKAMYAPPISGQIRCSSNIT
                     GLLLTRDGGNSNNESEIFRLGGGDMRDNWRSELYKYKVVKIEPLGVAPTKAKRRVVQR
                     EKRAVGIGALFLGFLGAAGSTMGAASMTLTVQARQLLSGIVQQQNNLLRAIEAQQHLL
                     QLTVWGIKQLQARILAVERYLKDQQLLGIWGCSGKLICTTAVPWNASWSNKSLEQIWN
                     HTTWMEWDREINNYTSLIHSLIEESQNQQEKNEQELLELDKWASLWNWFNITNWLWYI
                     KLFIMIVGGLVGLRIVFAVLSIVNRVRQGYSPLSFQTHLPIPRGPDRPEGIEEEGGER
                     DRDRSIRLVNGSLALIWDDLRSLCLFSYHRLRDLLLIVTRIVELLGRRGWEALKYWWN
                     LLQYWSQELKNSAVSLLNATAIAVAEGTDRVIEVVQGACRAIRHIPRRIRQGLERILL
                     "
     exon            8378..>8652
                     /note="trs protein"
                     /number=3
     exon            8378..>8423
                     /note="tat protein"
                     /number=3
     CDS             8796..9167
                     /note="27K protein (premature termination)"
                     /codon_start=1
                     /db_xref="PID:g327750"
                     /translation="MGGKWSKSSVIGWLTVRERMRRAEPAADGVGAASRDLEKHGAIT
                     SSNTAATNAACAWLEAQEEEEVGFPVTPQVPLRPMTYKAAVDLSHFLKEKGGLEGLIH
                     SQRRQDILDLWIYHTQGYFPD"
     LTR             9085..9718
                     /note="3' LTR"
     repeat_region   9539..9635
                     /note="R repeat 3' copy"
     polyA_signal    9611..9616
                     /note="HXB2 mRNA polyadenylation signal"
BASE COUNT     3411 a   1773 c   2370 g   2164 t
ORIGIN      435 bp upstream of PvuII site; 5' end of proviral genome.
        1 tggaagggct aattcactcc caacgaagac aagatatcct tgatctgtgg atctaccaca
       61 cacaaggcta cttccctgat tagcagaact acacaccagg gccagggatc agatatccac
      121 tgacctttgg atggtgctac aagctagtac cagttgagcc agagaagtta gaagaagcca
      181 acaaaggaga gaacaccagc ttgttacacc ctgtgagcct gcatggaatg gatgacccgg
      241 agagagaagt gttagagtgg aggtttgaca gccgcctagc atttcatcac atggcccgag
      301 agctgcatcc ggagtacttc aagaactgct gacatcgagc ttgctacaag ggactttccg
      361 ctggggactt tccagggagg cgtggcctgg gcgggactgg ggagtggcga gccctcagat
      421 cctgcatata agcagctgct ttttgcctgt actgggtctc tctggttaga ccagatctga
      481 gcctgggagc tctctggcta actagggaac ccactgctta agcctcaata aagcttgcct
      541 tgagtgcttc aagtagtgtg tgcccgtctg ttgtgtgact ctggtaacta gagatccctc
      601 agaccctttt agtcagtgtg gaaaatctct agcagtggcg cccgaacagg gacctgaaag
      661 cgaaagggaa accagagctc tctcgacgca ggactcggct tgctgaagcg cccgcacggc
      721 aagaggcgag gggcggcgac tggtgagtac gccaaaaatt ttgactagcg gaggctagaa
      781 ggagagagat gggtgcgaga gcgtcagtat taagcggggg agaattagat cgatgggaaa
      841 aaattcggtt aaggccaggg ggaaagaaaa aatataaatt aaaacatata gtatgggcaa
      901 gcagggagct agaacgattc gcagttaatc ctggcctgtt agaaacatca gaaggctgta
      961 gacaaatact gggacagcta caaccatccc ttcagacagg atcagaagaa cttagatcat
     1021 tatataatac agtagcaacc ctctattgtg tgcatcaaag gatagagata aaagacacca
     1081 aggaagcttt agacaagata gaggaagagc aaaacaaaag taagaaaaaa gcacagcaag
     1141 cagcagctga cacaggacac agcaatcagg tcagccaaaa ttaccctata gtgcagaaca
     1201 tccaggggca aatggtacat caggccatat cacctagaac tttaaatgca tgggtaaaag
     1261 tagtagaaga gaaggctttc agcccagaag tgatacccat gttttcagca ttatcagaag
     1321 gagccacccc acaagattta aacaccatgc taaacacagt ggggggacat caagcagcca
     1381 tgcaaatgtt aaaagagacc atcaatgagg aagctgcaga atgggataga gtgcatccag
     1441 tgcatgcagg gcctattgca ccaggccaga tgagagaacc aaggggaagt gacatagcag
     1501 gaactactag tacccttcag gaacaaatag gatggatgac aaataatcca cctatcccag
     1561 taggagaaat ttataaaaga tggataatcc tgggattaaa taaaatagta agaatgtata
     1621 gccctaccag cattctggac ataagacaag gaccaaagga accctttaga gactatgtag
     1681 accggttcta taaaactcta agagccgagc aagcttcaca ggaggtaaaa aattggatga
     1741 cagaaacctt gttggtccaa aatgcgaacc cagattgtaa gactatttta aaagcattgg
     1801 gaccagcggc tacactagaa gaaatgatga cagcatgtca gggagtagga ggacccggcc
     1861 ataaggcaag agttttggct gaagcaatga gccaagtaac aaattcagct accataatga
     1921 tgcagagagg caattttagg aaccaaagaa agattgttaa gtgtttcaat tgtggcaaag
     1981 aagggcacac agccagaaat tgcagggccc ctaggaaaaa gggctgttgg aaatgtggaa
     2041 aggaaggaca ccaaatgaaa gattgtactg agagacaggc taatttttta gggaagatct
     2101 ggccttccta caagggaagg ccagggaatt ttcttcagag cagaccagag ccaacagccc
     2161 caccagaaga gagcttcagg tctggggtag agacaacaac tccccctcag aagcaggagc
     2221 cgatagacaa ggaactgtat cctttaactt ccctcaggtc actctttggc aacgacccct
     2281 cgtcacaata aagatagggg ggcaactaaa ggaagctcta ttagatacag gagcagatga
     2341 tacagtatta gaagaaatga gtttgccagg aagatggaaa ccaaaaatga tagggggaat
     2401 tggaggtttt atcaaagtaa gacagtatga tcagatactc atagaaatct gtggacataa
     2461 agctataggt acagtattag taggacctac acctgtcaac ataattggaa gaaatctgtt
     2521 gactcagatt ggttgcactt taaattttcc cattagccct attgagactg taccagtaaa
     2581 attaaagcca ggaatggatg gcccaaaagt taaacaatgg ccattgacag aagaaaaaat
     2641 aaaagcatta gtagaaattt gtacagagat ggaaaaggaa gggaaaattt caaaaattgg
     2701 gcctgaaaat ccatacaata ctccagtatt tgccataaag aaaaaagaca gtactaaatg
     2761 gagaaaatta gtagatttca gagaacttaa taagagaact caagacttct gggaagttca
     2821 attaggaata ccacatcccg cagggttaaa aaagaaaaaa tcagtaacag tactggatgt
     2881 gggtgatgca tatttttcag ttcccttaga tgaagacttc aggaagtata ctgcatttac
     2941 catacctagt ataaacaatg agacaccagg gattagatat cagtacaatg tgcttccaca
     3001 gggatggaaa ggatcaccag caatattcca aagtagcatg acaaaaatct tagagccttt
     3061 tagaaaacaa aatccagaca tagttatcta tcaatacatg gatgatttgt atgtaggatc
     3121 tgacttagaa atagggcagc atagaacaaa aatagaggag ctgagacaac atctgttgag
     3181 gtggggactt accacaccag acaaaaaaca tcagaaagaa cctccattcc tttggatggg
     3241 ttatgaactc catcctgata aatggacagt acagcctata gtgctgccag aaaaagacag
     3301 ctggactgtc aatgacatac agaagttagt ggggaaattg aattgggcaa gtcagattta
     3361 cccagggatt aaagtaaggc aattatgtaa actccttaga ggaaccaaag cactaacaga
     3421 agtaatacca ctaacagaag aagcagagct agaactggca gaaaacagag agattctaaa
     3481 agaaccagta catggagtgt attatgaccc atcaaaagac ttaatagcag aaatacagaa
     3541 gcaggggcaa ggccaatgga catatcaaat ttatcaagag ccatttaaaa atctgaaaac
     3601 aggaaaatat gcaagaatga ggggtgccca cactaatgat gtaaaacaat taacagaggc
     3661 agtgcaaaaa ataaccacag aaagcatagt aatatgggga aagactccta aatttaaact
     3721 gcccatacaa aaggaaacat gggaaacatg gtggacagag tattggcaag ccacctggat
     3781 tcctgagtgg gagtttgtta atacccctcc cttagtgaaa ttatggtacc agttagagaa
     3841 agaacccata gtaggagcag aaaccttcta tgtagatggg gcagctaaca gggagactaa
     3901 attaggaaaa gcaggatatg ttactaatag aggaagacaa aaagttgtca ccctaactga
     3961 cacaacaaat cagaagactg agttacaagc aatttatcta gctttgcagg attcgggatt
     4021 agaagtaaac atagtaacag actcacaata tgcattagga atcattcaag cacaaccaga
     4081 tcaaagtgaa tcagagttag tcaatcaaat aatagagcag ttaataaaaa aggaaaaggt
     4141 ctatctggca tgggtaccag cacacaaagg aattggagga aatgaacaag tagataaatt
     4201 agtcagtgct ggaatcagga aagtactatt tttagatgga atagataagg cccaagatga
     4261 acatgagaaa tatcacagta attggagagc aatggctagt gattttaacc tgccacctgt
     4321 agtagcaaaa gaaatagtag ccagctgtga taaatgtcag ctaaaaggag aagccatgca
     4381 tggacaagta gactgtagtc caggaatatg gcaactagat tgtacacatt tagaaggaaa
     4441 agttatcctg gtagcagttc atgtagccag tggatatata gaagcagaag ttattccagc
     4501 agaaacaggg caggaaacag catattttct tttaaaatta gcaggaagat ggccagtaaa
     4561 aacaatacat actgacaatg gcagcaattt caccggtgct acggttaggg ccgcctgttg
     4621 gtgggcggga atcaagcagg aatttggaat tccctacaat ccccaaagtc aaggagtagt
     4681 agaatctatg aataaagaat taaagaaaat tataggacag gtaagagatc aggctgaaca
     4741 tcttaagaca gcagtacaaa tggcagtatt catccacaat tttaaaagaa aaggggggat
     4801 tggggggtac agtgcagggg aaagaatagt agacataata gcaacagaca tacaaactaa
     4861 agaattacaa aaacaaatta caaaaattca aaattttcgg gtttattaca gggacagcag
     4921 aaattcactt tggaaaggac cagcaaagct cctctggaaa ggtgaagggg cagtagtaat
     4981 acaagataat agtgacataa aagtagtgcc aagaagaaaa gcaaagatca ttagggatta
     5041 tggaaaacag atggcaggtg atgattgtgt ggcaagtaga caggatgagg attagaacat
     5101 ggaaaagttt agtaaaacac catatgtatg tttcagggaa agctagggga tggttttata
     5161 gacatcacta tgaaagccct catccaagaa taagttcaga agtacacatc ccactagggg
     5221 atgctagatt ggtaataaca acatattggg gtctgcatac aggagaaaga gactggcatt
     5281 tgggtcaggg agtctccata gaatggagga aaaagagata tagcacacaa gtagaccctg
     5341 aactagcaga ccaactaatt catctgtatt actttgactg tttttcagac tctgctataa
     5401 gaaaggcctt attaggacac atagttagcc ctaggtgtga atatcaagca ggacataaca
     5461 aggtaggatc tctacaatac ttggcactag cagcattaat aacaccaaaa aagataaagc
     5521 cacctttgcc tagtgttacg aaactgacag aggatagatg gaacaagccc cagaagacca
     5581 agggccacag agggagccac acaatgaatg gacactagag cttttagagg agcttaagaa
     5641 tgaagctgtt agacattttc ctaggatttg gctccatggc ttagggcaac atatctatga
     5701 aacttatggg gatacttggg caggagtgga agccataata agaattctgc aacaactgct
     5761 gtttatccat tttcagaatt gggtgtcgac atagcagaat aggcgttact cgacagagga
     5821 gagcaagaaa tggagccagt agatcctaga ctagagccct ggaagcatcc aggaagtcag
     5881 cctaaaactg cttgtaccaa ttgctattgt aaaaagtgtt gctttcattg ccaagtttgt
     5941 ttcataacaa aagccttagg catctcctat ggcaggaaga agcggagaca gcgacgaaga
     6001 gctcatcaga acagtcagac tcatcaagct tctctatcaa agcagtaagt agtacatgta
     6061 acgcaaccta taccaatagt agcaatagta gcattagtag tagcaataat aatagcaata
     6121 gttgtgtggt ccatagtaat catagaatat aggaaaatat taagacaaag aaaaatagac
     6181 aggttaattg atagactaat agaaagagca gaagacagtg gcaatgagag tgaaggagaa
     6241 atatcagcac ttgtggagat gggggtggag atggggcacc atgctccttg ggatgttgat
     6301 gatctgtagt gctacagaaa aattgtgggt cacagtctat tatggggtac ctgtgtggaa
     6361 ggaagcaacc accactctat tttgtgcatc agatgctaaa gcatatgata cagaggtaca
     6421 taatgtttgg gccacacatg cctgtgtacc cacagacccc aacccacaag aagtagtatt
     6481 ggtaaatgtg acagaaaatt ttgacatgtg gaaaaatgac atggtagaac agatgcatga
     6541 ggatataatc agtttatggg atcaaagcct aaagccatgt gtaaaattaa ccccactctg
     6601 tgttagttta aagtgcactg atttgaagaa tgatactaat accaatagta gtagcgggag
     6661 aatgataatg gagaaaggag agataaaaaa ctgctctttc aatatcagca caagcataag
     6721 aggtaaggtg cagaaagaat atgcattttt ttataaactt gatataatac caatagataa
     6781 tgatactacc agctatagct tgacaagttg taacacctca gtcattacac aggcctgtcc
     6841 aaaggtatcc tttgagccaa ttcccataca ttattgtgcc ccggctggtt ttgcgattct
     6901 aaaatgtaat aataagacgt tcaatggaac aggaccatgt acaaatgtca gcacagtaca
     6961 atgtacacat ggaattaggc cagtagtatc aactcaactg ctgttaaatg gcagtctagc
     7021 agaagaagag gtagtaatta gatctgtcaa tttcacggac aatgctaaaa ccataatagt
     7081 acagctgaac acatctgtag aaattaattg tacaagaccc aacaacaata caagaaaaag
     7141 aatccgtatc cagagaggac cagggagagc atttgttaca ataggaaaaa taggaaatat
     7201 gagacaagca cattgtaaca ttagtagagc aaaatggaat aacactttaa aacagataga
     7261 tagcaaatta agagaacaat tcggaaataa taaaacaata atctttaagc aatcctcagg
     7321 aggggaccca gaaattgtaa cgcacagttt taattgtgga ggggaatttt tctactgtaa
     7381 ttcaacacaa ctgtttaata gtacttggtt taatagtact tggagtactg aagggtcaaa
     7441 taacactgaa ggaagtgaca caatcaccct cccatgcaga ataaaacaaa ttataaacat
     7501 gtggcagaaa gtaggaaaag caatgtatgc ccctcccatc agtggacaaa ttagatgttc
     7561 atcaaatatt acagggctgc tattaacaag agatggtggt aatagcaaca atgagtccga
     7621 gatcttcaga cttggaggag gagatatgag ggacaattgg agaagtgaat tatataaata
     7681 taaagtagta aaaattgaac cattaggagt agcacccacc aaggcaaaga gaagagtggt
     7741 gcagagagaa aaaagagcag tgggaatagg agctttgttc cttgggttct tgggagcagc
     7801 aggaagcact atgggcgcag cctcaatgac gctgacggta caggccagac aattattgtc
     7861 tggtatagtg cagcagcaga acaatttgct gagggctatt gaggcgcaac agcatctgtt
     7921 gcaactcaca gtctggggca tcaagcagct ccaagcaaga atcctagctg tggaaagata
     7981 cctaaaggat caacagctcc tagggatttg gggttgctct ggaaaactca tttgcaccac
     8041 tgctgtgcct tggaatgcta gttggagtaa taaatctctg gaacagatct ggaatcacac
     8101 gacctggatg gagtgggaca gagaaattaa caattacaca agcttaatac actccttaat
     8161 tgaagaatcg caaaaccagc aagaaaagaa tgaacaagaa ttattggaat tagataaatg
     8221 ggcaagtttg tggaattggt ttaacataac aaattggctg tggtatataa aattattcat
     8281 aatgatagta ggaggcttgg taggtttaag aatagttttt gctgtacttt ctatagtgaa
     8341 tagagttagg cagggatatt caccattatc gtttcagacc cacctcccaa tcccgagggg
     8401 acccgacagg cccgaaggaa tagaagaaga aggtggagag agagacagag acagatccat
     8461 tcgattagtg aacggatcct tggcacttat ctgggacgat ctgcggagcc tgtgcctctt
     8521 cagctaccac cgcttgagag acttactctt gattgtaacg aggattgtgg aacttctggg
     8581 acgcaggggg tgggaagccc tcaaatattg gtggaatctc ctacagtatt ggagtcagga
     8641 actaaagaat agtgctgtta gcttgctcaa tgccacagcc atagcagtag ctgaggggac
     8701 agatagggtt atagaagtag tacaaggagc ttgtagagct attcgccaca tacctagaag
     8761 aataagacag ggcttggaaa ggattttgct ataagatggg tggcaagtgg tcaaaaagta
     8821 gtgtgattgg atggcttact gtaagggaaa gaatgagacg agctgagcca gcagcagatg
     8881 gggtgggagc agcatctcga gacctggaaa aacatggagc aatcacaagt agcaacacag
     8941 cagctaccaa tgctgcttgt gcctggctag aagcacaaga ggaggaggag gtgggttttc
     9001 cagtcacacc tcaggtacct ttaagaccaa tgacttacaa ggcagctgta gatcttagcc
     9061 actttttaaa agaaaagggg ggactggaag ggctaattca ctcccaaaga agacaagata
     9121 tccttgatct gtggatctac cacacacaag gctacttccc tgattgacag aactacacac
     9181 cagggccagg ggtcagatat ccactgacct ttggatggtg ctacaagcta gtaccagttg
     9241 agccagataa gatagaagag gccaataaag gagagaacac cagcttgtta caccctgtga
     9301 gcctgcatgg gatggatgac ccggagagag aagtgttaga gtggaggttt gacagccgcc
     9361 tagcatttca tcacgtggcc cgagagctgc atccggagta cttcaagaac tgctgacatc
     9421 gagcttgcta caagggactt tccgctgggg actttccagg gaggcgtggc ctgggcggga
     9481 ctggggagtg gcgagccctc agatcctgca tataagcagc tgctttttgc ctgtactggg
     9541 tctctctggt tagaccagat ctgagcctgg gagctctctg gctaactagg gaacccactg
     9601 cttaagcctc aataaagctt gccttgagtg cttcaagtag tgtgtgcccg tctgttgtgt
     9661 gactctggta actagagatc cctcagaccc ttttagtcag tgtggaaaat ctctagca
//