                         SEQUENCE LISTING

<110>  The Trustees of the University of Pennsylvania
 
<120>  COMPOSITION FOR TREATMENT OF CRIGLER-NAJJAR SYNDROME

<130>  UPN-16-7685PCT

<150>  US 62/266,969
<151>  2015-12-14

<150>  US 62/348,029
<151>  2016-06-09

<160>  18    

<170>  PatentIn version 3.5

<210>  1
<211>  1599
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  Engineered UGT1A1 sequence U001

<400>  1
atggccgtgg agagccaggg ggggcggccc ctggtgctgg ggctgctgct gtgcgtgctg       60

gggcccgtgg tgagccacgc cgggaagatc ctgctgatcc ccgtggacgg gagccactgg      120

ctgagcatgc tgggggccat ccagcagctg cagcagcggg ggcacgagat cgtggtgctg      180

gcccccgacg ccagcctgta catccgggac ggggccttct acaccctgaa gacctacccc      240

gtgcccttcc agcgggagga cgtgaaggag agcttcgtga gcctggggca caacgtgttc      300

gagaacgaca gcttcctgca gcgggtgatc aagacctaca agaagatcaa gaaggacagc      360

gccatgctgc tgagcgggtg cagccacctg ctgcacaaca aggagctgat ggccagcctg      420

gccgagagca gcttcgacgt gatgctgacc gaccccttcc tgccctgcag ccccatcgtg      480

gcccagtacc tgagcctgcc caccgtgttc ttcctgcacg ccctgccctg cagcctggag      540

ttcgaggcca cccagtgccc caaccccttc agctacgtgc cccggcccct gagcagccac      600

agcgaccaca tgaccttcct gcagcgggtg aagaacatgc tgatcgcctt cagccagaac      660

ttcctgtgcg acgtggtgta cagcccctac gccaccctgg ccagcgagtt cctgcagcgg      720

gaggtgaccg tgcaggacct gctgagcagc gccagcgtgt ggctgttccg gagcgacttc      780

gtgaaggact acccccggcc catcatgccc aacatggtgt tcgtgggggg gatcaactgc      840

ctgcaccaga accccctgag ccaggagttc gaggcctaca tcaacgccag cggggagcac      900

gggatcgtgg tgttcagcct ggggagcatg gtgagcgaga tccccgagaa gaaggccatg      960

gccatcgccg acgccctggg gaagatcccc cagaccgtgc tgtggcggta caccgggacc     1020

cggcccagca acctggccaa caacaccatc ctggtgaagt ggctgcccca gaacgacctg     1080

ctggggcacc ccatgacccg ggccttcatc acccacgccg ggagccacgg ggtgtacgag     1140

agcatctgca acggggtgcc catggtgatg atgcccctgt tcggggacca gatggacaac     1200

gccaagcgga tggagaccaa gggggccggg gtgaccctga acgtgctgga gatgaccagc     1260

gaggacctgg agaacgccct gaaggccgtg atcaacgaca agagctacaa ggagaacatc     1320

atgcggctga gcagcctgca caaggaccgg cccgtggagc ccctggacct ggccgtgttc     1380

tgggtggagt tcgtgatgcg gcacaagggg gccccccacc tgcggcccgc cgcccacgac     1440

ctgacctggt accagtacca cagcctggac gtgatcgggt tcctgctggc cgtggtgctg     1500

accgtggcct tcatcacctt caagtgctgc gcctacgggt accggaagtg cctggggaag     1560

aaggggcggg tgaagaaggc ccacaagagc aagacccac                            1599


<210>  2
<211>  1599
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  Engineered UGT1A1 sequence U011TY

<400>  2
atggctgtgg aaagccaggg cggccggccc ctggtgctgg gcctgctgct gtgtgtgctg       60

ggccccgtgg tgagccacgc tggcaagatt ctgctgattc ccgtggacgg cagccactgg      120

ctgagcatgc tgggcgctat tcagcagctg cagcagcggg gccacgaaat tgtggtgctg      180

gctcccgacg ctagcctgta cattcgggac ggcgcttttt acaccctgaa gacctacccc      240

gtgccctttc agcgggaaga cgtgaaggaa agctttgtga gcctgggcca caacgtgttt      300

gaaaacgaca gctttctgca gcgggtgatt aagacctaca agaagattaa gaaggacagc      360

gctatgctgc tgagcggctg tagccacctg ctgcacaaca aggaactgat ggctagcctg      420

gctgaaagca gctttgacgt gatgctgacc gacccctttc tgccctgtag ccccattgtg      480

gctcagtacc tgagcctgcc caccgtgttt tttctgcacg ctctgccctg tagcctggaa      540

tttgaagcta cccagtgtcc caaccccttt agctacgtgc cccggcccct gagcagccac      600

agcgaccaca tgacctttct gcagcgggtg aagaacatgc tgattgcttt tagccagaac      660

tttctgtgtg acgtggtgta cagcccctac gctaccctgg ctagcgaatt tctgcagcgg      720

gaagtgaccg tgcaggacct gctgagcagc gctagcgtgt ggctgtttcg gagcgacttt      780

gtgaaggact acccccggcc cattatgccc aacatggtgt ttgtgggcgg cattaactgt      840

ctgcaccaga accccctgag ccaggaattt gaagcttaca ttaacgctag cggcgaacac      900

ggcattgtgg tgtttagcct gggcagcatg gtgagcgaaa ttcccgaaaa gaaggctatg      960

gctattgctg acgctctggg caagattccc cagaccgtgc tgtggcggta caccggcacc     1020

cggcccagca acctggctaa caacaccatt ctggtgaagt ggctgcccca gaacgacctg     1080

ctgggccacc ccatgacccg ggcttttatt acccacgctg gcagccacgg cgtgtacgaa     1140

agcatttgta acggcgtgcc catggtgatg atgcccctgt ttggcgacca gatggacaac     1200

gctaagcgga tggaaaccaa gggcgctggc gtgaccctga acgtgctgga aatgaccagc     1260

gaagacctgg aaaacgctct gaaggctgtg attaacgaca agagctacaa ggaaaacatt     1320

atgcggctga gcagcctgca caaggaccgg cccgtggaac ccctggacct ggctgtgttt     1380

tgggtggaat ttgtgatgcg gcacaagggc gctccccacc tgcggcccgc tgctcacgac     1440

ctgacctggt accagtacca cagcctggac gtgattggct ttctgctggc tgtggtgctg     1500

accgtggctt ttattacctt taagtgttgt gcttacggct accggaagtg tctgggcaag     1560

aagggccggg tgaagaaggc tcacaagagc aagacccac                            1599


<210>  3
<211>  1599
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  Engineered UGT1A1 sequence U201DP

<400>  3
atggccgtgg agagccaggg aggacggcct ctggtgctgg gactgctgct gtgcgtgctg       60

ggacctgtgg tgagccacgc cggaaagatc ctgctgatcc ctgtggacgg aagccactgg      120

ctgagcatgc tgggagccat ccagcagctg cagcagcggg gacacgagat cgtggtgctg      180

gcccctgacg ccagcctgta catccgggac ggagccttct acaccctgaa gacctaccct      240

gtgcctttcc agcgggagga cgtgaaggag agcttcgtga gcctgggaca caacgtgttc      300

gagaacgaca gcttcctgca gcgggtgatc aagacctaca agaagatcaa gaaggacagc      360

gccatgctgc tgagcggctg cagccacctg ctgcacaaca aggagctgat ggccagcctg      420

gccgagagca gcttcgacgt gatgctgacc gaccctttcc tgccttgcag ccctatcgtg      480

gcccagtacc tgagcctgcc taccgtgttc ttcctgcacg ccctgccttg cagcctggag      540

ttcgaggcca cccagtgccc taaccctttc agctacgtgc ctcggcctct gagcagccac      600

agcgaccaca tgaccttcct gcagcgggtg aagaacatgc tgatcgcctt cagccagaac      660

ttcctgtgcg acgtggtgta cagcccttac gccaccctgg ccagcgagtt cctgcagcgg      720

gaggtgaccg tgcaggacct gctgagcagc gccagcgtgt ggctgttccg gagcgacttc      780

gtgaaggact accctcggcc tatcatgcct aacatggtgt tcgtgggagg aatcaactgc      840

ctgcaccaga accctctgag ccaggagttc gaggcctaca tcaacgccag cggagagcac      900

ggaatcgtgg tgttcagcct gggaagcatg gtgagcgaga tccctgagaa gaaggccatg      960

gccatcgccg acgccctggg aaagatccct cagaccgtgc tgtggcggta caccggaacc     1020

cggcctagca acctggccaa caacaccatc ctggtgaagt ggctgcctca gaacgacctg     1080

ctgggacacc ctatgacccg ggccttcatc acccacgccg gaagccacgg agtgtacgag     1140

agcatctgca acggagtgcc tatggtgatg atgcctctgt tcggagacca gatggacaac     1200

gccaagcgga tggagaccaa gggagccgga gtgaccctga acgtgctgga gatgaccagc     1260

gaggacctgg agaacgccct gaaggccgtg atcaacgaca agagctacaa ggagaacatc     1320

atgcggctga gcagcctgca caaggaccgg cctgtggagc ctctggacct ggccgtgttc     1380

tgggtggagt tcgtgatgcg gcacaaggga gcccctcacc tgcggcctgc cgcccacgac     1440

ctgacctggt accagtacca cagcctggac gtgatcggat tcctgctggc cgtggtgctg     1500

accgtggcct tcatcacctt caagtgctgc gcctacggat accggaagtg cctgggaaag     1560

aagggacggg tgaagaaggc ccacaagagc aagacccac                            1599


<210>  4
<211>  1599
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  Engineered UGT1A1 sequence

<400>  4
atggccgtgg aatctcaggg cggcagacct ctggtgctgg gcctgctgct gtgtgtgctg       60

ggacctgtgg tgtctcacgc cggcaagatc ctgctgatcc ccgtggacgg cagccactgg      120

ctgtctatgc tgggcgccat tcagcagctg cagcagaggg gccacgagat cgtggtgctg      180

gcccctgacg ccagcctgta catcagagat ggcgccttct acaccctgaa aacctacccc      240

gtgcccttcc agcgcgagga cgtgaaagaa agcttcgtgt ccctgggcca caacgtgttc      300

gagaacgaca gcttcctgca gagagtgatc aagacctaca agaagatcaa gaaagacagc      360

gccatgctgc tgagcggctg ctcccatctg ctgcacaaca aagaactgat ggcctccctg      420

gccgagagca gcttcgacgt gatgctgacc gacccattcc tgccctgcag ccctatcgtg      480

gcccagtacc tgagcctgcc taccgtgttc ttcctgcacg ccctgccttg ctccctggaa      540

ttcgaggcca cccagtgccc caaccccttc agctacgtgc ccagaccact gagcagccac      600

agcgaccaca tgacctttct gcagcgcgtg aagaacatgc tgatcgcctt cagccagaac      660

ttcctgtgcg acgtggtgta cagcccctac gctaccctgg ccagcgaatt cctgcagcgg      720

gaagtgaccg tgcaggacct gctgtctagc gccagcgtgt ggctgttccg cagcgacttc      780

gtgaaggact accccagacc catcatgccc aacatggtgt tcgtgggcgg catcaactgc      840

ctgcaccaga accccctgag ccaggaattt gaggcctaca tcaacgccag cggcgagcac      900

ggcatcgtgg tgtttagcct gggcagcatg gtgtccgaga tccccgagaa aaaggccatg      960

gctatcgccg acgccctggg aaagatcccc cagacagtgc tgtggcggta caccggcacc     1020

agacccagca acctggccaa caacaccatc ctcgtgaaat ggctgcccca gaacgacctg     1080

ctgggccacc ctatgacccg ggcctttatc acacacgccg gctcccacgg cgtgtacgag     1140

agcatctgca acggcgtgcc catggtcatg atgcccctgt tcggcgacca gatggacaac     1200

gccaagcgga tggaaacaaa gggcgctggc gtgaccctga acgtgctgga aatgaccagc     1260

gaggacctgg aaaacgccct gaaggccgtg atcaacgaca agagctacaa agaaaacatc     1320

atgcggctgt ccagcctgca caaggacaga cccgtggaac ccctggacct ggccgtgttc     1380

tgggtggaat tcgtgatgcg gcacaagggc gctccccatc tgaggcctgc agctcacgac     1440

ctgacctggt atcagtacca cagcctggac gtgatcggct tcctgctggc agtggtgctg     1500

accgtggcct tcatcacctt caagtgctgc gcctacggct accggaagtg cctgggcaag     1560

aaaggcagag tgaagaaggc ccacaagagc aagacccac                            1599


<210>  5
<211>  1602
<212>  DNA
<213>  Homo sapiens


<220>
<221>  CDS
<222>  (1)..(1602)

<400>  5
atg gct gtg gag tcc cag ggc gga cgc cca ctt gtc ctg ggc ctg ctg         48
Met Ala Val Glu Ser Gln Gly Gly Arg Pro Leu Val Leu Gly Leu Leu           
1               5                   10                  15                

ctg tgt gtg ctg ggc cca gtg gtg tcc cat gct ggg aag ata ctg ttg         96
Leu Cys Val Leu Gly Pro Val Val Ser His Ala Gly Lys Ile Leu Leu           
            20                  25                  30                    

atc cca gtg gat ggc agc cac tgg ctg agc atg ctt ggg gcc atc cag        144
Ile Pro Val Asp Gly Ser His Trp Leu Ser Met Leu Gly Ala Ile Gln           
        35                  40                  45                        

cag ctg cag cag agg gga cat gaa ata gtt gtc cta gca cct gac gcc        192
Gln Leu Gln Gln Arg Gly His Glu Ile Val Val Leu Ala Pro Asp Ala           
    50                  55                  60                            

tcg ttg tac atc aga gac gga gca ttt tac acc ttg aag acg tac cct        240
Ser Leu Tyr Ile Arg Asp Gly Ala Phe Tyr Thr Leu Lys Thr Tyr Pro           
65                  70                  75                  80            

gtg cca ttc caa agg gag gat gtg aaa gag tct ttt gtt agt ctc ggg        288
Val Pro Phe Gln Arg Glu Asp Val Lys Glu Ser Phe Val Ser Leu Gly           
                85                  90                  95                

cat aat gtt ttt gag aat gat tct ttc ctg cag cgt gtg atc aaa aca        336
His Asn Val Phe Glu Asn Asp Ser Phe Leu Gln Arg Val Ile Lys Thr           
            100                 105                 110                   

tac aag aaa ata aaa aag gac tct gct atg ctt ttg tct ggc tgt tcc        384
Tyr Lys Lys Ile Lys Lys Asp Ser Ala Met Leu Leu Ser Gly Cys Ser           
        115                 120                 125                       

cac tta ctg cac aac aag gag ctc atg gcc tcc ctg gca gaa agc agc        432
His Leu Leu His Asn Lys Glu Leu Met Ala Ser Leu Ala Glu Ser Ser           
    130                 135                 140                           

ttt gat gtc atg ctg acg gac cct ttc ctt cct tgc agc ccc atc gtg        480
Phe Asp Val Met Leu Thr Asp Pro Phe Leu Pro Cys Ser Pro Ile Val           
145                 150                 155                 160           

gcc cag tac ctg tct ctg ccc act gta ttc ttc ttg cat gca ctg cca        528
Ala Gln Tyr Leu Ser Leu Pro Thr Val Phe Phe Leu His Ala Leu Pro           
                165                 170                 175               

tgc agc ctg gaa ttt gag gct acc cag tgc ccc aac cca ttc tcc tac        576
Cys Ser Leu Glu Phe Glu Ala Thr Gln Cys Pro Asn Pro Phe Ser Tyr           
            180                 185                 190                   

gtg ccc agg cct ctc tcc tct cat tca gat cac atg acc ttc ctg cag        624
Val Pro Arg Pro Leu Ser Ser His Ser Asp His Met Thr Phe Leu Gln           
        195                 200                 205                       

cgg gtg aag aac atg ctc att gcc ttt tca cag aac ttt ctg tgc gac        672
Arg Val Lys Asn Met Leu Ile Ala Phe Ser Gln Asn Phe Leu Cys Asp           
    210                 215                 220                           

gtg gtt tat tcc ccg tat gca acc ctt gcc tca gaa ttc ctt cag aga        720
Val Val Tyr Ser Pro Tyr Ala Thr Leu Ala Ser Glu Phe Leu Gln Arg           
225                 230                 235                 240           

gag gtg act gtc cag gac cta ttg agc tct gca tct gtc tgg ctg ttt        768
Glu Val Thr Val Gln Asp Leu Leu Ser Ser Ala Ser Val Trp Leu Phe           
                245                 250                 255               

aga agt gac ttt gtg aag gat tac cct agg ccc atc atg ccc aat atg        816
Arg Ser Asp Phe Val Lys Asp Tyr Pro Arg Pro Ile Met Pro Asn Met           
            260                 265                 270                   

gtt ttt gtt ggt gga atc aac tgc ctt cac caa aat cca cta tcc cag        864
Val Phe Val Gly Gly Ile Asn Cys Leu His Gln Asn Pro Leu Ser Gln           
        275                 280                 285                       

gaa ttt gaa gcc tac att aat gct tct gga gaa cat gga att gtg gtt        912
Glu Phe Glu Ala Tyr Ile Asn Ala Ser Gly Glu His Gly Ile Val Val           
    290                 295                 300                           

ttc tct ttg gga tca atg gtc tca gaa att cca gag aag aaa gct atg        960
Phe Ser Leu Gly Ser Met Val Ser Glu Ile Pro Glu Lys Lys Ala Met           
305                 310                 315                 320           

gca att gct gat gct ttg ggc aaa atc cct cag aca gtc ctg tgg cgg       1008
Ala Ile Ala Asp Ala Leu Gly Lys Ile Pro Gln Thr Val Leu Trp Arg           
                325                 330                 335               

tac act gga acc cga cca tcg aat ctt gcg aac aac acg ata ctt gtt       1056
Tyr Thr Gly Thr Arg Pro Ser Asn Leu Ala Asn Asn Thr Ile Leu Val           
            340                 345                 350                   

aag tgg cta ccc caa aac gat ctg ctt ggt cac ccg atg acc cgt gcc       1104
Lys Trp Leu Pro Gln Asn Asp Leu Leu Gly His Pro Met Thr Arg Ala           
        355                 360                 365                       

ttt atc acc cat gct ggt tcc cat ggt gtt tat gaa agc ata tgc aat       1152
Phe Ile Thr His Ala Gly Ser His Gly Val Tyr Glu Ser Ile Cys Asn           
    370                 375                 380                           

ggc gtt ccc atg gtg atg atg ccc ttg ttt ggt gat cag atg gac aat       1200
Gly Val Pro Met Val Met Met Pro Leu Phe Gly Asp Gln Met Asp Asn           
385                 390                 395                 400           

gca aag cgc atg gag act aag gga gct gga gtg acc ctg aat gtt ctg       1248
Ala Lys Arg Met Glu Thr Lys Gly Ala Gly Val Thr Leu Asn Val Leu           
                405                 410                 415               

gaa atg act tct gaa gat tta gaa aat gct cta aaa gca gtc atc aat       1296
Glu Met Thr Ser Glu Asp Leu Glu Asn Ala Leu Lys Ala Val Ile Asn           
            420                 425                 430                   

gac aaa agt tac aag gag aac atc atg cgc ctc tcc agc ctt cac aag       1344
Asp Lys Ser Tyr Lys Glu Asn Ile Met Arg Leu Ser Ser Leu His Lys           
        435                 440                 445                       

gac cgc ccg gtg gag ccg ctg gac ctg gcc gtg ttc tgg gtg gag ttt       1392
Asp Arg Pro Val Glu Pro Leu Asp Leu Ala Val Phe Trp Val Glu Phe           
    450                 455                 460                           

gtg atg agg cac aag ggc gcg cca cac ctg cgc ccc gca gcc cac gac       1440
Val Met Arg His Lys Gly Ala Pro His Leu Arg Pro Ala Ala His Asp           
465                 470                 475                 480           

ctc acc tgg tac cag tac cat tcc ttg gac gtg att ggt ttc ctc ttg       1488
Leu Thr Trp Tyr Gln Tyr His Ser Leu Asp Val Ile Gly Phe Leu Leu           
                485                 490                 495               

gcc gtc gtg ctg aca gtg gcc ttc atc acc ttt aaa tgt tgt gct tat       1536
Ala Val Val Leu Thr Val Ala Phe Ile Thr Phe Lys Cys Cys Ala Tyr           
            500                 505                 510                   

ggc tac cgg aaa tgc ttg ggg aaa aaa ggg cga gtt aag aaa gcc cac       1584
Gly Tyr Arg Lys Cys Leu Gly Lys Lys Gly Arg Val Lys Lys Ala His           
        515                 520                 525                       

aaa tcc aag acc cat tga                                               1602
Lys Ser Lys Thr His                                                       
    530                                                                   


<210>  6
<211>  533
<212>  PRT
<213>  Homo sapiens

<400>  6

Met Ala Val Glu Ser Gln Gly Gly Arg Pro Leu Val Leu Gly Leu Leu 
1               5                   10                  15      


Leu Cys Val Leu Gly Pro Val Val Ser His Ala Gly Lys Ile Leu Leu 
            20                  25                  30          


Ile Pro Val Asp Gly Ser His Trp Leu Ser Met Leu Gly Ala Ile Gln 
        35                  40                  45              


Gln Leu Gln Gln Arg Gly His Glu Ile Val Val Leu Ala Pro Asp Ala 
    50                  55                  60                  


Ser Leu Tyr Ile Arg Asp Gly Ala Phe Tyr Thr Leu Lys Thr Tyr Pro 
65                  70                  75                  80  


Val Pro Phe Gln Arg Glu Asp Val Lys Glu Ser Phe Val Ser Leu Gly 
                85                  90                  95      


His Asn Val Phe Glu Asn Asp Ser Phe Leu Gln Arg Val Ile Lys Thr 
            100                 105                 110         


Tyr Lys Lys Ile Lys Lys Asp Ser Ala Met Leu Leu Ser Gly Cys Ser 
        115                 120                 125             


His Leu Leu His Asn Lys Glu Leu Met Ala Ser Leu Ala Glu Ser Ser 
    130                 135                 140                 


Phe Asp Val Met Leu Thr Asp Pro Phe Leu Pro Cys Ser Pro Ile Val 
145                 150                 155                 160 


Ala Gln Tyr Leu Ser Leu Pro Thr Val Phe Phe Leu His Ala Leu Pro 
                165                 170                 175     


Cys Ser Leu Glu Phe Glu Ala Thr Gln Cys Pro Asn Pro Phe Ser Tyr 
            180                 185                 190         


Val Pro Arg Pro Leu Ser Ser His Ser Asp His Met Thr Phe Leu Gln 
        195                 200                 205             


Arg Val Lys Asn Met Leu Ile Ala Phe Ser Gln Asn Phe Leu Cys Asp 
    210                 215                 220                 


Val Val Tyr Ser Pro Tyr Ala Thr Leu Ala Ser Glu Phe Leu Gln Arg 
225                 230                 235                 240 


Glu Val Thr Val Gln Asp Leu Leu Ser Ser Ala Ser Val Trp Leu Phe 
                245                 250                 255     


Arg Ser Asp Phe Val Lys Asp Tyr Pro Arg Pro Ile Met Pro Asn Met 
            260                 265                 270         


Val Phe Val Gly Gly Ile Asn Cys Leu His Gln Asn Pro Leu Ser Gln 
        275                 280                 285             


Glu Phe Glu Ala Tyr Ile Asn Ala Ser Gly Glu His Gly Ile Val Val 
    290                 295                 300                 


Phe Ser Leu Gly Ser Met Val Ser Glu Ile Pro Glu Lys Lys Ala Met 
305                 310                 315                 320 


Ala Ile Ala Asp Ala Leu Gly Lys Ile Pro Gln Thr Val Leu Trp Arg 
                325                 330                 335     


Tyr Thr Gly Thr Arg Pro Ser Asn Leu Ala Asn Asn Thr Ile Leu Val 
            340                 345                 350         


Lys Trp Leu Pro Gln Asn Asp Leu Leu Gly His Pro Met Thr Arg Ala 
        355                 360                 365             


Phe Ile Thr His Ala Gly Ser His Gly Val Tyr Glu Ser Ile Cys Asn 
    370                 375                 380                 


Gly Val Pro Met Val Met Met Pro Leu Phe Gly Asp Gln Met Asp Asn 
385                 390                 395                 400 


Ala Lys Arg Met Glu Thr Lys Gly Ala Gly Val Thr Leu Asn Val Leu 
                405                 410                 415     


Glu Met Thr Ser Glu Asp Leu Glu Asn Ala Leu Lys Ala Val Ile Asn 
            420                 425                 430         


Asp Lys Ser Tyr Lys Glu Asn Ile Met Arg Leu Ser Ser Leu His Lys 
        435                 440                 445             


Asp Arg Pro Val Glu Pro Leu Asp Leu Ala Val Phe Trp Val Glu Phe 
    450                 455                 460                 


Val Met Arg His Lys Gly Ala Pro His Leu Arg Pro Ala Ala His Asp 
465                 470                 475                 480 


Leu Thr Trp Tyr Gln Tyr His Ser Leu Asp Val Ile Gly Phe Leu Leu 
                485                 490                 495     


Ala Val Val Leu Thr Val Ala Phe Ile Thr Phe Lys Cys Cys Ala Tyr 
            500                 505                 510         


Gly Tyr Arg Lys Cys Leu Gly Lys Lys Gly Arg Val Lys Lys Ala His 
        515                 520                 525             


Lys Ser Lys Thr His 
    530             


<210>  7
<211>  1602
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  optimized UGT1A1 v2.1

<400>  7
atggctgtgg aatcacaagg aggtagacca ctggttctcg gacttttgct ttgcgtgctg       60

gggcccgtgg tgtcgcatgc cggaaagatc ctgctgatcc cggtggatgg atcacactgg      120

ctgtccatgc tgggtgccat ccaacagctc cagcagcggg gccacgaaat tgtggtcctg      180

gccccggacg cttccctgta tattcgggac ggagcgttct acactctcaa gacctaccct      240

gtccccttcc aaagggagga cgtgaaggaa agctttgtgt cgctggggca taatgtgttc      300

gagaacgaca gcttcctcca aagggttatt aaaacctaca agaagatcaa aaaggattcg      360

gccatgctcc tttccggatg ttcacacctg ttgcataaca aggaattgat ggccagcctg      420

gcagaatcca gctttgacgt catgcttact gacccgttct tgccttgctc cccgattgtg      480

gcccaatacc tgtcgctccc aaccgtgttc ttcctgcacg ccttgccttg ttcgctggaa      540

ttcgaagcga ctcagtgtcc caatccgttc tcctacgtcc cgcgcccgct ttcaagccat      600

tcggatcaca tgactttcct ccagcgcgtc aagaacatgc tcattgcgtt cagccagaac      660

tttctgtgcg acgtggttta ctcaccttac gctaccttgg cttctgagtt cctgcagaga      720

gaagtgactg tgcaagatct gctgtcctca gcgtccgttt ggttgttccg gtctgacttc      780

gtcaaggact acccgcgccc gatcatgccg aatatggtct ttgtgggcgg tatcaactgc      840

ctgcatcaaa acccactgag ccaggagttt gaggcgtaca tcaacgcctc gggagagcat      900

ggaatcgtgg tgttctccct cggttccatg gtgtccgaga tcccggaaaa gaaggcaatg      960

gccatcgcag atgccctggg caaaatcccg cagaccgtgc tctggcgcta cacgggtact     1020

cggcctagca atttggcaaa caacaccatc ctggtgaaat ggctgccgca gaacgacctc     1080

ctgggccacc caatgactcg cgctttcatt acccatgcgg gctcgcacgg agtctacgaa     1140

tccatctgca atggagtccc gatggtgatg atgccacttt tcggagatca gatggataat     1200

gcaaaaagaa tggaaaccaa gggggccgga gtgacgctga acgtgcttga aatgacctcg     1260

gaagatctgg agaacgctct caaagcggtg atcaacgaca agtcctacaa ggaaaacatc     1320

atgcgcctga gctccctcca caaggaccga ccagtggaac cgctggacct cgcggtcttt     1380

tgggtggagt tcgtgatgag gcacaagggc gccccccacc tcagacccgc agctcatgac     1440

ctcacttggt accagtacca ttcgctggat gtcatcggct ttctcctggc ggtcgtgctc     1500

accgtggcgt tcatcacctt caagtgctgc gcctacggat atcgcaaatg cttggggaag     1560

aaaggacggg tgaagaaggc acacaagtca aagacgcact ga                        1602


<210>  8
<211>  1602
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  optimized UGT1A1 v3

<400>  8
atggccgtgg aatctcaggg cggcagacct ctggtgctgg gcctgctgct gtgtgtgctg       60

ggacctgtgg tgtctcacgc cggcaagatc ctgctgatcc ccgtggatgg cagccactgg      120

ctgtctatgc tgggcgccat tcagcagctg cagcagaggg gccacgagat cgtggtgctg      180

gcccctgatg ccagcctgta catcagagat ggcgccttct acaccctgaa aacctacccc      240

gtgcccttcc agcgcgagga cgtgaaagaa agcttcgtgt ccctgggcca caacgtgttc      300

gagaacgaca gcttcctgca gagagtgatc aagacctaca agaagatcaa gaaagacagc      360

gccatgctgc tgagcggctg ctcccatctg ctgcacaaca aagaactgat ggcctccctg      420

gccgagagca gcttcgacgt gatgctgacc gacccattcc tgccctgcag ccctatcgtg      480

gcccagtacc tgagcctgcc taccgtgttc ttcctgcacg ccctgccttg ctccctggaa      540

ttcgaggcca cccagtgccc caaccccttc agctacgtgc ccagaccact gagcagccac      600

agcgaccaca tgacctttct gcagcgcgtg aagaacatgc tgatcgcctt cagccagaac      660

ttcctgtgcg acgtggtgta cagcccctac gctaccctgg ccagcgaatt cctgcagcgg      720

gaagtgaccg tgcaggacct gctgtctagc gccagcgtgt ggctgttccg cagcgacttc      780

gtgaaggact accccagacc catcatgccc aacatggtgt tcgtgggcgg catcaactgc      840

ctgcaccaga accccctgag ccaggaattt gaggcctaca tcaacgccag cggcgagcac      900

ggcatcgtgg tgtttagcct gggcagcatg gtgtccgaga tccccgagaa aaaggccatg      960

gctatcgccg acgccctggg aaagatcccc cagacagtgc tgtggcggta caccggcacc     1020

agacccagca acctggccaa caacaccatc ctcgtgaaat ggctgcccca gaacgacctg     1080

ctgggccacc ctatgacccg ggcctttatc acacacgccg gctcccatgg cgtgtacgag     1140

agcatctgca acggcgtgcc catggtcatg atgcccctgt tcggcgacca gatggacaac     1200

gccaagcgga tggaaacaaa gggcgctggc gtgaccctga acgtgctgga aatgaccagc     1260

gaggacctgg aaaacgccct gaaggccgtg atcaacgaca agagctacaa agaaaacatc     1320

atgcggctgt ccagcctgca caaggacaga cccgtggaac ccctggacct ggccgtgttc     1380

tgggtggaat tcgtgatgcg gcacaagggc gctccccatc tgaggcctgc agctcacgac     1440

ctgacctggt atcagtacca cagcctggac gtgatcggct tcctgctggc agtggtgctg     1500

accgtggcct tcatcacctt caagtgctgc gcctacggct accggaagtg cctgggcaag     1560

aaaggcagag tgaagaaggc ccacaagagc aagacccact ga                        1602


<210>  9
<211>  6498
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  pAAV.TBG.hUGT1A1co.WPRE.BGH (p3793) vector


<220>
<221>  repeat_region
<222>  (1)..(168)
<223>  5' ITR

<220>
<221>  enhancer
<222>  (211)..(310)
<223>  alpha mic/bik

<220>
<221>  enhancer
<222>  (317)..(416)
<223>  alpha mic/bik

<220>
<221>  promoter
<222>  (431)..(907)
<223>  TBG promoter

<220>
<221>  intron
<222>  (939)..(1071)
<223>  SV40 misc intron (Promega)

<220>
<221>  misc_feature
<222>  (1086)..(1091)
<223>  Kozak

<220>
<221>  misc_feature
<222>  (1092)..(2690)
<223>  UGT1A1 CDS

<220>
<221>  misc_feature
<222>  (2709)..(3250)
<223>  WPRE

<220>
<221>  polyA_signal
<222>  (3257)..(3471)
<223>  BGH polyA

<220>
<221>  repeat_region
<222>  (3521)..(3688)
<223>  3' ITR

<220>
<221>  misc_feature
<222>  (4451)..(5308)
<223>  Amp-R CDS

<220>
<221>  misc_feature
<222>  (5482)..(6070)
<223>  origin

<400>  9
ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt       60

ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact      120

aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct      180

aggaagatcg gaattcgccc ttaagctagc aggttaattt ttaaaaagca gtcaaaagtc      240

caagtggccc ttggcagcat ttactctctc tgtttgctct ggttaataat ctcaggagca      300

caaacattcc agatccaggt taatttttaa aaagcagtca aaagtccaag tggcccttgg      360

cagcatttac tctctctgtt tgctctggtt aataatctca ggagcacaaa cattccagat      420

ccggcgcgcc agggctggaa gctacctttg acatcatttc ctctgcgaat gcatgtataa      480

tttctacaga acctattaga aaggatcacc cagcctctgc ttttgtacaa ctttccctta      540

aaaaactgcc aattccactg ctgtttggcc caatagtgag aactttttcc tgctgcctct      600

tggtgctttt gcctatggcc cctattctgc ctgctgaaga cactcttgcc agcatggact      660

taaacccctc cagctctgac aatcctcttt ctcttttgtt ttacatgaag ggtctggcag      720

ccaaagcaat cactcaaagt tcaaacctta tcattttttg ctttgttcct cttggccttg      780

gttttgtaca tcagctttga aaataccatc ccagggttaa tgctggggtt aatttataac      840

taagagtgct ctagttttgc aatacaggac atgctataaa aatggaaaga tgttgctttc      900

tgagagactg cagaagttgg tcgtgaggca ctgggcaggt aagtatcaag gttacaagac      960

aggtttaagg agaccaatag aaactgggct tgtcgagaca gagaagactc ttgcgtttct     1020

gataggcacc tattggtctt actgacatcc actttgcctt tctctccaca ggtgtccagg     1080

cggccgccac catggccgtg gaatctcagg gcggcagacc tctggtgctg ggcctgctgc     1140

tgtgtgtgct gggacctgtg gtgtctcacg ccggcaagat cctgctgatc cccgtggacg     1200

gcagccactg gctgtctatg ctgggcgcca ttcagcagct gcagcagagg ggccacgaga     1260

tcgtggtgct ggcccctgac gccagcctgt acatcagaga tggcgccttc tacaccctga     1320

aaacctaccc cgtgcccttc cagcgcgagg acgtgaaaga aagcttcgtg tccctgggcc     1380

acaacgtgtt cgagaacgac agcttcctgc agagagtgat caagacctac aagaagatca     1440

agaaagacag cgccatgctg ctgagcggct gctcccatct gctgcacaac aaagaactga     1500

tggcctccct ggccgagagc agcttcgacg tgatgctgac cgacccattc ctgccctgca     1560

gccctatcgt ggcccagtac ctgagcctgc ctaccgtgtt cttcctgcac gccctgcctt     1620

gctccctgga attcgaggcc acccagtgcc ccaacccctt cagctacgtg cccagaccac     1680

tgagcagcca cagcgaccac atgacctttc tgcagcgcgt gaagaacatg ctgatcgcct     1740

tcagccagaa cttcctgtgc gacgtggtgt acagccccta cgctaccctg gccagcgaat     1800

tcctgcagcg ggaagtgacc gtgcaggacc tgctgtctag cgccagcgtg tggctgttcc     1860

gcagcgactt cgtgaaggac taccccagac ccatcatgcc caacatggtg ttcgtgggcg     1920

gcatcaactg cctgcaccag aaccccctga gccaggaatt tgaggcctac atcaacgcca     1980

gcggcgagca cggcatcgtg gtgtttagcc tgggcagcat ggtgtccgag atccccgaga     2040

aaaaggccat ggctatcgcc gacgccctgg gaaagatccc ccagacagtg ctgtggcggt     2100

acaccggcac cagacccagc aacctggcca acaacaccat cctcgtgaaa tggctgcccc     2160

agaacgacct gctgggccac cctatgaccc gggcctttat cacacacgcc ggctcccacg     2220

gcgtgtacga gagcatctgc aacggcgtgc ccatggtcat gatgcccctg ttcggcgacc     2280

agatggacaa cgccaagcgg atggaaacaa agggcgctgg cgtgaccctg aacgtgctgg     2340

aaatgaccag cgaggacctg gaaaacgccc tgaaggccgt gatcaacgac aagagctaca     2400

aagaaaacat catgcggctg tccagcctgc acaaggacag acccgtggaa cccctggacc     2460

tggccgtgtt ctgggtggaa ttcgtgatgc ggcacaaggg cgctccccat ctgaggcctg     2520

cagctcacga cctgacctgg tatcagtacc acagcctgga cgtgatcggc ttcctgctgg     2580

cagtggtgct gaccgtggcc ttcatcacct tcaagtgctg cgcctacggc taccggaagt     2640

gcctgggcaa gaaaggcaga gtgaagaagg cccacaagag caagacccac tgataagcat     2700

gcggatccaa tcaacctctg gattacaaaa tttgtgaaag attgactggt attcttaact     2760

atgttgctcc ttttacgcta tgtggatacg ctgctttaat gcctttgtat catgctattg     2820

cttcccgtat ggctttcatt ttctcctcct tgtataaatc ctggttgctg tctctttatg     2880

aggagttgtg gcccgttgtc aggcaacgtg gcgtggtgtg cactgtgttt gctgacgcaa     2940

cccccactgg ttggggcatt gccaccacct gtcagctcct ttccgggact ttcgctttcc     3000

ccctccctat tgccacggcg gaactcatcg ccgcctgcct tgcccgctgc tggacagggg     3060

ctcggctgtt gggcactgac aattccgtgg tgttgtcggg gaagctgacg tcctttccat     3120

ggctgctcgc ctgtgttgcc acctggattc tgcgcgggac gtccttctgc tacgtccctt     3180

cggccctcaa tccagcggac cttccttccc gcggcctgct gccggctctg cggcctcttc     3240

cgcgtcttcg agatctgcct cgactgtgcc ttctagttgc cagccatctg ttgtttgccc     3300

ctcccccgtg ccttccttga ccctggaagg tgccactccc actgtccttt cctaataaaa     3360

tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct attctggggg gtggggtggg     3420

gcaggacagc aagggggagg attgggaaga caatagcagg catgctgggg actcgagtta     3480

agggcgaatt cccgattagg atcttcctag agcatggcta cgtagataag tagcatggcg     3540

ggttaatcat taactacaag gaacccctag tgatggagtt ggccactccc tctctgcgcg     3600

ctcgctcgct cactgaggcc gggcgaccaa aggtcgcccg acgcccgggc tttgcccggg     3660

cggcctcagt gagcgagcga gcgcgcagcc ttaattaacc taattcactg gccgtcgttt     3720

tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt gcagcacatc     3780

cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct tcccaacagt     3840

tgcgcagcct gaatggcgaa tgggacgcgc cctgtagcgg cgcattaagc gcggcgggtg     3900

tggtggttac gcgcagcgtg accgctacac ttgccagcgc cctagcgccc gctcctttcg     3960

ctttcttccc ttcctttctc gccacgttcg ccggctttcc ccgtcaagct ctaaatcggg     4020

ggctcccttt agggttccga tttagtgctt tacggcacct cgaccccaaa aaacttgatt     4080

agggtgatgg ttcacgtagt gggccatcgc cctgatagac ggtttttcgc cctttgacgt     4140

tggagtccac gttctttaat agtggactct tgttccaaac tggaacaaca ctcaacccta     4200

tctcggtcta ttcttttgat ttataaggga ttttgccgat ttcggcctat tggttaaaaa     4260

atgagctgat ttaacaaaaa tttaacgcga attttaacaa aatattaacg cttacaattt     4320

aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt tctaaataca     4380

ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat aatattgaaa     4440

aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt ttgcggcatt     4500

ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg ctgaagatca     4560

gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga tccttgagag     4620

ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc tatgtggcgc     4680

ggtattatcc cgtattgacg ccgggcaaga gcaactcggt cgccgcatac actattctca     4740

gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg gcatgacagt     4800

aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca acttacttct     4860

gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg gggatcatgt     4920

aactcgcctt gatcgttggg aaccggagct gaatgaagcc ataccaaacg acgagcgtga     4980

caccacgatg cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg gcgaactact     5040

tactctagct tcccggcaac aattaataga ctggatggag gcggataaag ttgcaggacc     5100

acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg gagccggtga     5160

gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct cccgtatcgt     5220

agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac agatcgctga     5280

gataggtgcc tcactgatta agcattggta actgtcagac caagtttact catatatact     5340

ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga tcctttttga     5400

taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt cagaccccgt     5460

agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct gctgcttgca     5520

aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc taccaactct     5580

ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgttc ttctagtgta     5640

gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc tcgctctgct     5700

aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg ggttggactc     5760

aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt cgtgcacaca     5820

gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg agctatgaga     5880

aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg     5940

aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt atagtcctgt     6000

cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag gggggcggag     6060

cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt gctggccttt     6120

tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta ttaccgcctt     6180

tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt cagtgagcga     6240

ggaagcggaa gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc cgattcatta     6300

atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca acgcaattaa     6360

tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc cggctcgtat     6420

gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatg accatgatta     6480

cgccagattt aattaagg                                                   6498


<210>  10
<211>  5962
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  pAAV.TBG.hUGT1A1co.BGH


<220>
<221>  repeat_region
<222>  (1)..(168)
<223>  5' ITR

<220>
<221>  enhancer
<222>  (211)..(310)
<223>  alpha mic/bik

<220>
<221>  enhancer
<222>  (317)..(416)
<223>  alpha mic/bik

<220>
<221>  promoter
<222>  (431)..(907)
<223>  TBG promoter

<220>
<221>  intron
<222>  (939)..(1071)
<223>  SV40 misc intron (Promega)

<220>
<221>  misc_feature
<222>  (1086)..(1091)
<223>  Kozak

<220>
<221>  misc_feature
<222>  (1092)..(2690)
<223>  UGT1A1 coding sequence

<220>
<221>  repeat_region
<222>  (2985)..(3152)
<223>  3' ITR

<220>
<221>  polyA_signal
<222>  (2721)..(2935)
<223>  BGH polyA

<220>
<221>  misc_feature
<222>  (3915)..(4772)
<223>  Amp-R coding sequence

<220>
<221>  misc_feature
<222>  (4946)..(4772)
<223>  origin

<400>  10
ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt       60

ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact      120

aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct      180

aggaagatcg gaattcgccc ttaagctagc aggttaattt ttaaaaagca gtcaaaagtc      240

caagtggccc ttggcagcat ttactctctc tgtttgctct ggttaataat ctcaggagca      300

caaacattcc agatccaggt taatttttaa aaagcagtca aaagtccaag tggcccttgg      360

cagcatttac tctctctgtt tgctctggtt aataatctca ggagcacaaa cattccagat      420

ccggcgcgcc agggctggaa gctacctttg acatcatttc ctctgcgaat gcatgtataa      480

tttctacaga acctattaga aaggatcacc cagcctctgc ttttgtacaa ctttccctta      540

aaaaactgcc aattccactg ctgtttggcc caatagtgag aactttttcc tgctgcctct      600

tggtgctttt gcctatggcc cctattctgc ctgctgaaga cactcttgcc agcatggact      660

taaacccctc cagctctgac aatcctcttt ctcttttgtt ttacatgaag ggtctggcag      720

ccaaagcaat cactcaaagt tcaaacctta tcattttttg ctttgttcct cttggccttg      780

gttttgtaca tcagctttga aaataccatc ccagggttaa tgctggggtt aatttataac      840

taagagtgct ctagttttgc aatacaggac atgctataaa aatggaaaga tgttgctttc      900

tgagagactg cagaagttgg tcgtgaggca ctgggcaggt aagtatcaag gttacaagac      960

aggtttaagg agaccaatag aaactgggct tgtcgagaca gagaagactc ttgcgtttct     1020

gataggcacc tattggtctt actgacatcc actttgcctt tctctccaca ggtgtccagg     1080

cggccgccac catggccgtg gaatctcagg gcggcagacc tctggtgctg ggcctgctgc     1140

tgtgtgtgct gggacctgtg gtgtctcacg ccggcaagat cctgctgatc cccgtggatg     1200

gcagccactg gctgtctatg ctgggcgcca ttcagcagct gcagcagagg ggccacgaga     1260

tcgtggtgct ggcccctgat gccagcctgt acatcagaga tggcgccttc tacaccctga     1320

aaacctaccc cgtgcccttc cagcgcgagg acgtgaaaga aagcttcgtg tccctgggcc     1380

acaacgtgtt cgagaacgac agcttcctgc agagagtgat caagacctac aagaagatca     1440

agaaagacag cgccatgctg ctgagcggct gctcccatct gctgcacaac aaagaactga     1500

tggcctccct ggccgagagc agcttcgacg tgatgctgac cgacccattc ctgccctgca     1560

gccctatcgt ggcccagtac ctgagcctgc ctaccgtgtt cttcctgcac gccctgcctt     1620

gctccctgga attcgaggcc acccagtgcc ccaacccctt cagctacgtg cccagaccac     1680

tgagcagcca cagcgaccac atgacctttc tgcagcgcgt gaagaacatg ctgatcgcct     1740

tcagccagaa cttcctgtgc gacgtggtgt acagccccta cgctaccctg gccagcgaat     1800

tcctgcagcg ggaagtgacc gtgcaggacc tgctgtctag cgccagcgtg tggctgttcc     1860

gcagcgactt cgtgaaggac taccccagac ccatcatgcc caacatggtg ttcgtgggcg     1920

gcatcaactg cctgcaccag aaccccctga gccaggaatt tgaggcctac atcaacgcca     1980

gcggcgagca cggcatcgtg gtgtttagcc tgggcagcat ggtgtccgag atccccgaga     2040

aaaaggccat ggctatcgcc gacgccctgg gaaagatccc ccagacagtg ctgtggcggt     2100

acaccggcac cagacccagc aacctggcca acaacaccat cctcgtgaaa tggctgcccc     2160

agaacgacct gctgggccac cctatgaccc gggcctttat cacacacgcc ggctcccatg     2220

gcgtgtacga gagcatctgc aacggcgtgc ccatggtcat gatgcccctg ttcggcgacc     2280

agatggacaa cgccaagcgg atggaaacaa agggcgctgg cgtgaccctg aacgtgctgg     2340

aaatgaccag cgaggacctg gaaaacgccc tgaaggccgt gatcaacgac aagagctaca     2400

aagaaaacat catgcggctg tccagcctgc acaaggacag acccgtggaa cccctggacc     2460

tggccgtgtt ctgggtggaa ttcgtgatgc ggcacaaggg cgctccccat ctgaggcctg     2520

cagctcacga cctgacctgg tatcagtacc acagcctgga cgtgatcggc ttcctgctgg     2580

cagtggtgct gaccgtggcc ttcatcacct tcaagtgctg cgcctacggc taccggaagt     2640

gcctgggcaa gaaaggcaga gtgaagaagg cccacaagag caagacccac tgataagcat     2700

gcgtcgacgg atccagatct gcctcgactg tgccttctag ttgccagcca tctgttgttt     2760

gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac tcccactgtc ctttcctaat     2820

aaaatgagga aattgcatcg cattgtctga gtaggtgtca ttctattctg gggggtgggg     2880

tggggcagga cagcaagggg gaggattggg aagacaatag caggcatgct ggggactcga     2940

gttaagggcg aattcccgat taggatcttc ctagagcatg gctacgtaga taagtagcat     3000

ggcgggttaa tcattaacta caaggaaccc ctagtgatgg agttggccac tccctctctg     3060

cgcgctcgct cgctcactga ggccgggcga ccaaaggtcg cccgacgccc gggctttgcc     3120

cgggcggcct cagtgagcga gcgagcgcgc agccttaatt aacctaattc actggccgtc     3180

gttttacaac gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca     3240

catccccctt tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa     3300

cagttgcgca gcctgaatgg cgaatgggac gcgccctgta gcggcgcatt aagcgcggcg     3360

ggtgtggtgg ttacgcgcag cgtgaccgct acacttgcca gcgccctagc gcccgctcct     3420

ttcgctttct tcccttcctt tctcgccacg ttcgccggct ttccccgtca agctctaaat     3480

cgggggctcc ctttagggtt ccgatttagt gctttacggc acctcgaccc caaaaaactt     3540

gattagggtg atggttcacg tagtgggcca tcgccctgat agacggtttt tcgccctttg     3600

acgttggagt ccacgttctt taatagtgga ctcttgttcc aaactggaac aacactcaac     3660

cctatctcgg tctattcttt tgatttataa gggattttgc cgatttcggc ctattggtta     3720

aaaaatgagc tgatttaaca aaaatttaac gcgaatttta acaaaatatt aacgcttaca     3780

atttaggtgg cacttttcgg ggaaatgtgc gcggaacccc tatttgttta tttttctaaa     3840

tacattcaaa tatgtatccg ctcatgagac aataaccctg ataaatgctt caataatatt     3900

gaaaaaggaa gagtatgagt attcaacatt tccgtgtcgc ccttattccc ttttttgcgg     3960

cattttgcct tcctgttttt gctcacccag aaacgctggt gaaagtaaaa gatgctgaag     4020

atcagttggg tgcacgagtg ggttacatcg aactggatct caacagcggt aagatccttg     4080

agagttttcg ccccgaagaa cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg     4140

gcgcggtatt atcccgtatt gacgccgggc aagagcaact cggtcgccgc atacactatt     4200

ctcagaatga cttggttgag tactcaccag tcacagaaaa gcatcttacg gatggcatga     4260

cagtaagaga attatgcagt gctgccataa ccatgagtga taacactgcg gccaacttac     4320

ttctgacaac gatcggagga ccgaaggagc taaccgcttt tttgcacaac atgggggatc     4380

atgtaactcg ccttgatcgt tgggaaccgg agctgaatga agccatacca aacgacgagc     4440

gtgacaccac gatgcctgta gcaatggcaa caacgttgcg caaactatta actggcgaac     4500

tacttactct agcttcccgg caacaattaa tagactggat ggaggcggat aaagttgcag     4560

gaccacttct gcgctcggcc cttccggctg gctggtttat tgctgataaa tctggagccg     4620

gtgagcgtgg gtctcgcggt atcattgcag cactggggcc agatggtaag ccctcccgta     4680

tcgtagttat ctacacgacg gggagtcagg caactatgga tgaacgaaat agacagatcg     4740

ctgagatagg tgcctcactg attaagcatt ggtaactgtc agaccaagtt tactcatata     4800

tactttagat tgatttaaaa cttcattttt aatttaaaag gatctaggtg aagatccttt     4860

ttgataatct catgaccaaa atcccttaac gtgagttttc gttccactga gcgtcagacc     4920

ccgtagaaaa gatcaaagga tcttcttgag atcctttttt tctgcgcgta atctgctgct     4980

tgcaaacaaa aaaaccaccg ctaccagcgg tggtttgttt gccggatcaa gagctaccaa     5040

ctctttttcc gaaggtaact ggcttcagca gagcgcagat accaaatact gttcttctag     5100

tgtagccgta gttaggccac cacttcaaga actctgtagc accgcctaca tacctcgctc     5160

tgctaatcct gttaccagtg gctgctgcca gtggcgataa gtcgtgtctt accgggttgg     5220

actcaagacg atagttaccg gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca     5280

cacagcccag cttggagcga acgacctaca ccgaactgag atacctacag cgtgagctat     5340

gagaaagcgc cacgcttccc gaagggagaa aggcggacag gtatccggta agcggcaggg     5400

tcggaacagg agagcgcacg agggagcttc cagggggaaa cgcctggtat ctttatagtc     5460

ctgtcgggtt tcgccacctc tgacttgagc gtcgattttt gtgatgctcg tcaggggggc     5520

ggagcctatg gaaaaacgcc agcaacgcgg cctttttacg gttcctggcc ttttgctggc     5580

cttttgctca catgttcttt cctgcgttat cccctgattc tgtggataac cgtattaccg     5640

cctttgagtg agctgatacc gctcgccgca gccgaacgac cgagcgcagc gagtcagtga     5700

gcgaggaagc ggaagagcgc ccaatacgca aaccgcctct ccccgcgcgt tggccgattc     5760

attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag cgcaacgcaa     5820

ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg cttccggctc     5880

gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc tatgaccatg     5940

attacgccag atttaattaa gg                                              5962


<210>  11
<211>  738
<212>  PRT
<213>  Artificial Sequence

<220>
<223>  AAV8 VP1 capsid protein

<400>  11

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser 
1               5                   10                  15      


Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Lys Pro 
            20                  25                  30          


Lys Ala Asn Gln Gln Lys Gln Asp Asp Gly Arg Gly Leu Val Leu Pro 
        35                  40                  45              


Gly Tyr Lys Tyr Leu Gly Pro Phe Asn Gly Leu Asp Lys Gly Glu Pro 
    50                  55                  60                  


Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp 
65                  70                  75                  80  


Gln Gln Leu Gln Ala Gly Asp Asn Pro Tyr Leu Arg Tyr Asn His Ala 
                85                  90                  95      


Asp Ala Glu Phe Gln Glu Arg Leu Gln Glu Asp Thr Ser Phe Gly Gly 
            100                 105                 110         


Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Val Leu Glu Pro 
        115                 120                 125             


Leu Gly Leu Val Glu Glu Gly Ala Lys Thr Ala Pro Gly Lys Lys Arg 
    130                 135                 140                 


Pro Val Glu Pro Ser Pro Gln Arg Ser Pro Asp Ser Ser Thr Gly Ile 
145                 150                 155                 160 


Gly Lys Lys Gly Gln Gln Pro Ala Arg Lys Arg Leu Asn Phe Gly Gln 
                165                 170                 175     


Thr Gly Asp Ser Glu Ser Val Pro Asp Pro Gln Pro Leu Gly Glu Pro 
            180                 185                 190         


Pro Ala Ala Pro Ser Gly Val Gly Pro Asn Thr Met Ala Ala Gly Gly 
        195                 200                 205             


Gly Ala Pro Met Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser 
    210                 215                 220                 


Ser Ser Gly Asn Trp His Cys Asp Ser Thr Trp Leu Gly Asp Arg Val 
225                 230                 235                 240 


Ile Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His 
                245                 250                 255     


Leu Tyr Lys Gln Ile Ser Asn Gly Thr Ser Gly Gly Ala Thr Asn Asp 
            260                 265                 270         


Asn Thr Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn 
        275                 280                 285             


Arg Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn 
    290                 295                 300                 


Asn Asn Trp Gly Phe Arg Pro Lys Arg Leu Ser Phe Lys Leu Phe Asn 
305                 310                 315                 320 


Ile Gln Val Lys Glu Val Thr Gln Asn Glu Gly Thr Lys Thr Ile Ala 
                325                 330                 335     


Asn Asn Leu Thr Ser Thr Ile Gln Val Phe Thr Asp Ser Glu Tyr Gln 
            340                 345                 350         


Leu Pro Tyr Val Leu Gly Ser Ala His Gln Gly Cys Leu Pro Pro Phe 
        355                 360                 365             


Pro Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn 
    370                 375                 380                 


Asn Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr 
385                 390                 395                 400 


Phe Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Thr Tyr 
                405                 410                 415     


Thr Phe Glu Asp Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser 
            420                 425                 430         


Leu Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu 
        435                 440                 445             


Ser Arg Thr Gln Thr Thr Gly Gly Thr Ala Asn Thr Gln Thr Leu Gly 
    450                 455                 460                 


Phe Ser Gln Gly Gly Pro Asn Thr Met Ala Asn Gln Ala Lys Asn Trp 
465                 470                 475                 480 


Leu Pro Gly Pro Cys Tyr Arg Gln Gln Arg Val Ser Thr Thr Thr Gly 
                485                 490                 495     


Gln Asn Asn Asn Ser Asn Phe Ala Trp Thr Ala Gly Thr Lys Tyr His 
            500                 505                 510         


Leu Asn Gly Arg Asn Ser Leu Ala Asn Pro Gly Ile Ala Met Ala Thr 
        515                 520                 525             


His Lys Asp Asp Glu Glu Arg Phe Phe Pro Ser Asn Gly Ile Leu Ile 
    530                 535                 540                 


Phe Gly Lys Gln Asn Ala Ala Arg Asp Asn Ala Asp Tyr Ser Asp Val 
545                 550                 555                 560 


Met Leu Thr Ser Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr 
                565                 570                 575     


Glu Glu Tyr Gly Ile Val Ala Asp Asn Leu Gln Gln Gln Asn Thr Ala 
            580                 585                 590         


Pro Gln Ile Gly Thr Val Asn Ser Gln Gly Ala Leu Pro Gly Met Val 
        595                 600                 605             


Trp Gln Asn Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile 
    610                 615                 620                 


Pro His Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe 
625                 630                 635                 640 


Gly Leu Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val 
                645                 650                 655     


Pro Ala Asp Pro Pro Thr Thr Phe Asn Gln Ser Lys Leu Asn Ser Phe 
            660                 665                 670         


Ile Thr Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu 
        675                 680                 685             


Leu Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr 
    690                 695                 700                 


Ser Asn Tyr Tyr Lys Ser Thr Ser Val Asp Phe Ala Val Asn Thr Glu 
705                 710                 715                 720 


Gly Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg 
                725                 730                 735     


Asn Leu 
        


<210>  12
<211>  1599
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  engineered UGT1A1 U201DPmod

<400>  12
atggccgtgg agagccaggg aggacggcct ctggtgctgg gactgctgct gtgcgtgctg       60

ggacctgtgg tgagccacgc cggaaagatc ctgctgatcc ctgtggacgg aagccactgg      120

ctgagcatgc tgggagccat ccagcagctg cagcagcggg gacacgagat cgtggtgctg      180

gcccctgacg ccagcctgta catccgggac ggagccttct acaccctgaa gacctaccct      240

gtgcctttcc agcgggagga cgtgaaggag agcttcgtga gcctgggaca caacgtgttc      300

gagaacgata gcttcctgca gcgggtgatc aagacctaca agaagatcaa gaaggacagc      360

gccatgctgc tgagcggctg cagccacctg ctgcacaaca aggagctgat ggccagcctg      420

gccgagagca gcttcgacgt gatgctgacc gaccctttcc tgccttgcag ccctatcgtg      480

gcccagtacc tgagcctgcc taccgtgttc ttcctgcacg ccctgccttg cagcctggag      540

ttcgaggcca cccagtgccc taaccctttc agctacgtgc ctcggcctct gagcagccac      600

agcgaccaca tgaccttcct gcagcgggtg aagaacatgc tgatcgcctt cagccagaac      660

ttcctgtgcg acgtggtgta cagcccttac gccaccctgg ccagcgagtt cctgcagcgg      720

gaggtgaccg tgcaggacct gctgagcagc gccagcgtgt ggctgttccg gagcgacttc      780

gtgaaggact accctcggcc tatcatgcct aacatggtgt tcgtgggagg aatcaactgc      840

ctgcaccaga accctctgag ccaggagttc gaggcctaca tcaacgccag cggagagcac      900

ggaatcgtgg tgttcagcct gggaagcatg gtgagcgaga tccctgagaa gaaggccatg      960

gccatcgccg acgccctggg aaagatccct cagaccgtgc tgtggcggta caccggaacc     1020

cggcctagca acctggccaa caacaccatc ctggtgaagt ggctgcctca gaacgatctg     1080

ctgggacacc ctatgacccg ggccttcatc acccacgccg gaagccacgg agtgtacgag     1140

agcatctgca acggagtgcc tatggtgatg atgcctctgt tcggagacca gatggacaac     1200

gccaagcgga tggagaccaa gggagccgga gtgaccctga acgtgctgga gatgaccagc     1260

gaggacctgg agaacgccct gaaggccgtg atcaacgata agagctacaa ggagaacatc     1320

atgcggctga gcagcctgca caaggaccgg cctgtggagc ctctggacct ggccgtgttc     1380

tgggtggagt tcgtgatgcg gcacaaggga gcccctcacc tgcggcctgc cgcccacgac     1440

ctgacctggt accagtacca cagcctggac gtgatcggat tcctgctggc cgtggtgctg     1500

accgtggcct tcatcacctt caagtgctgc gcctacggat accggaagtg cctgggaaag     1560

aagggacggg tgaagaaggc ccacaagagc aagacccac                            1599


<210>  13
<211>  1599
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  engineered UGT1A1 U001mod

<400>  13
atggccgtgg agagccaggg ggggcggccc ctggtgctgg ggctgctgct gtgcgtgctg       60

gggcccgtgg tgagccacgc cgggaagatc ctgctgatcc ccgtggacgg gagccactgg      120

ctgagcatgc tgggggccat ccagcagctg cagcagcggg ggcacgagat cgtggtgctg      180

gcccccgacg ccagcctgta catccgggac ggggccttct acaccctgaa gacctacccc      240

gtgcccttcc agcgggagga cgtgaaggag agcttcgtga gcctggggca caacgtgttc      300

gagaacgata gcttcctgca gcgggtgatc aagacctaca agaagatcaa gaaggacagc      360

gccatgctgc tgagcgggtg cagccacctg ctgcacaaca aggagctgat ggccagcctg      420

gccgagagca gcttcgacgt gatgctgacc gaccccttcc tgccctgcag ccccatcgtg      480

gcccagtacc tgagcctgcc caccgtgttc ttcctgcacg ccctgccctg cagcctggag      540

ttcgaggcca cccagtgccc caaccccttc agctacgtgc cccggcccct gagcagccac      600

agcgaccaca tgaccttcct gcagcgggtg aagaacatgc tgatcgcctt cagccagaac      660

ttcctgtgcg acgtggtgta cagcccctac gccaccctgg ccagcgagtt cctgcagcgg      720

gaggtgaccg tgcaggacct gctgagcagc gccagcgtgt ggctgttccg gagcgacttc      780

gtgaaggact acccccggcc catcatgccc aacatggtgt tcgtgggggg gatcaactgc      840

ctgcaccaga accccctgag ccaggagttc gaggcctaca tcaacgccag cggggagcac      900

gggatcgtgg tgttcagcct ggggagcatg gtgagcgaga tccccgagaa gaaggccatg      960

gccatcgccg acgccctggg gaagatcccc cagaccgtgc tgtggcggta caccgggacc     1020

cggcccagca acctggccaa caacaccatc ctggtgaagt ggctgcccca gaacgatctg     1080

ctggggcacc ccatgacccg ggccttcatc acccacgccg ggagccacgg ggtgtacgag     1140

agcatctgca acggggtgcc catggtgatg atgcccctgt tcggggacca gatggacaac     1200

gccaagcgga tggagaccaa gggggccggg gtgaccctga acgtgctgga gatgaccagc     1260

gaggacctgg agaacgccct gaaggccgtg atcaacgata agagctacaa ggagaacatc     1320

atgcggctga gcagcctgca caaggaccgg cccgtggagc ccctggacct ggccgtgttc     1380

tgggtggagt tcgtgatgcg gcacaagggg gccccccacc tgcggcccgc cgcccacgac     1440

ctgacctggt accagtacca cagcctggac gtgatcgggt tcctgctggc cgtggtgctg     1500

accgtggcct tcatcacctt caagtgctgc gcctacgggt accggaagtg cctggggaag     1560

aaggggcggg tgaagaaggc ccacaagagc aagacccac                            1599


<210>  14
<211>  1599
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  engineered UGT1A1 U011TYmod

<400>  14
atggctgtgg aaagccaggg cggccggccc ctggtgctgg gcctgctgct gtgtgtgctg       60

ggccccgtgg tgagccacgc tggcaagatt ctgctgattc ccgtggacgg cagccactgg      120

ctgagcatgc tgggcgctat tcagcagctg cagcagcggg gccacgaaat tgtggtgctg      180

gctcccgacg ctagcctgta cattcgggac ggcgcttttt acaccctgaa gacctacccc      240

gtgccctttc agcgggaaga cgtgaaggaa agctttgtga gcctgggcca caacgtgttt      300

gaaaacgata gctttctgca gcgggtgatt aagacctaca agaagattaa gaaggacagc      360

gctatgctgc tgagcggctg tagccacctg ctgcacaaca aggaactgat ggctagcctg      420

gctgaaagca gctttgacgt gatgctgacc gacccctttc tgccctgtag ccccattgtg      480

gctcagtacc tgagcctgcc caccgtgttt tttctgcacg ctctgccctg tagcctggaa      540

tttgaagcta cccagtgtcc caaccccttt agctacgtgc cccggcccct gagcagccac      600

agcgaccaca tgacctttct gcagcgggtg aagaacatgc tgattgcttt tagccagaac      660

tttctgtgtg acgtggtgta cagcccctac gctaccctgg ctagcgaatt tctgcagcgg      720

gaagtgaccg tgcaggacct gctgagcagc gctagcgtgt ggctgtttcg gagcgacttt      780

gtgaaggact acccccggcc cattatgccc aacatggtgt ttgtgggcgg cattaactgt      840

ctgcaccaga accccctgag ccaggaattt gaagcttaca ttaacgctag cggcgaacac      900

ggcattgtgg tgtttagcct gggcagcatg gtgagcgaaa ttcccgaaaa gaaggctatg      960

gctattgctg acgctctggg caagattccc cagaccgtgc tgtggcggta caccggcacc     1020

cggcccagca acctggctaa caacaccatt ctggtgaagt ggctgcccca gaacgatctg     1080

ctgggccacc ccatgacccg ggcttttatt acccacgctg gcagccacgg cgtgtacgaa     1140

agcatttgta acggcgtgcc catggtgatg atgcccctgt ttggcgacca gatggacaac     1200

gctaagcgga tggaaaccaa gggcgctggc gtgaccctga acgtgctgga aatgaccagc     1260

gaagacctgg aaaacgctct gaaggctgtg attaacgata agagctacaa ggaaaacatt     1320

atgcggctga gcagcctgca caaggaccgg cccgtggaac ccctggacct ggctgtgttt     1380

tgggtggaat ttgtgatgcg gcacaagggc gctccccacc tgcggcccgc tgctcacgac     1440

ctgacctggt accagtacca cagcctggac gtgattggct ttctgctggc tgtggtgctg     1500

accgtggctt ttattacctt taagtgttgt gcttacggct accggaagtg tctgggcaag     1560

aagggccggg tgaagaaggc tcacaagagc aagacccac                            1599


<210>  15
<211>  3140
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  pAAV.TBG.U201DP.BGH (p4120)


<220>
<221>  repeat_region
<222>  (1)..(168)
<223>  5' ITR

<220>
<221>  enhancer
<222>  (211)..(310)
<223>  alpha mic/bik enhancer

<220>
<221>  enhancer
<222>  (317)..(416)
<223>  alpha mic/bik enhancer

<220>
<221>  promoter
<222>  (431)..(907)
<223>  TBG promoter

<220>
<221>  intron
<222>  (939)..(1071)
<223>  SV40 misc intron (Promega)

<220>
<221>  misc_feature
<222>  (1092)..(2690)
<223>  U201DPmod CDS

<220>
<221>  polyA_signal
<222>  (2709)..(2923)
<223>  BGH polyA

<220>
<221>  repeat_region
<222>  (2973)..(3140)
<223>  AAV 3' ITR

<400>  15
ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt       60

ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact      120

aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct      180

aggaagatcg gaattcgccc ttaagctagc aggttaattt ttaaaaagca gtcaaaagtc      240

caagtggccc ttggcagcat ttactctctc tgtttgctct ggttaataat ctcaggagca      300

caaacattcc agatccaggt taatttttaa aaagcagtca aaagtccaag tggcccttgg      360

cagcatttac tctctctgtt tgctctggtt aataatctca ggagcacaaa cattccagat      420

ccggcgcgcc agggctggaa gctacctttg acatcatttc ctctgcgaat gcatgtataa      480

tttctacaga acctattaga aaggatcacc cagcctctgc ttttgtacaa ctttccctta      540

aaaaactgcc aattccactg ctgtttggcc caatagtgag aactttttcc tgctgcctct      600

tggtgctttt gcctatggcc cctattctgc ctgctgaaga cactcttgcc agcatggact      660

taaacccctc cagctctgac aatcctcttt ctcttttgtt ttacatgaag ggtctggcag      720

ccaaagcaat cactcaaagt tcaaacctta tcattttttg ctttgttcct cttggccttg      780

gttttgtaca tcagctttga aaataccatc ccagggttaa tgctggggtt aatttataac      840

taagagtgct ctagttttgc aatacaggac atgctataaa aatggaaaga tgttgctttc      900

tgagagactg cagaagttgg tcgtgaggca ctgggcaggt aagtatcaag gttacaagac      960

aggtttaagg agaccaatag aaactgggct tgtcgagaca gagaagactc ttgcgtttct     1020

gataggcacc tattggtctt actgacatcc actttgcctt tctctccaca ggtgtccagg     1080

cggccgccac catggccgtg gagagccagg gaggacggcc tctggtgctg ggactgctgc     1140

tgtgcgtgct gggacctgtg gtgagccacg ccggaaagat cctgctgatc cctgtggacg     1200

gaagccactg gctgagcatg ctgggagcca tccagcagct gcagcagcgg ggacacgaga     1260

tcgtggtgct ggcccctgac gccagcctgt acatccggga cggagccttc tacaccctga     1320

agacctaccc tgtgcctttc cagcgggagg acgtgaagga gagcttcgtg agcctgggac     1380

acaacgtgtt cgagaacgat agcttcctgc agcgggtgat caagacctac aagaagatca     1440

agaaggacag cgccatgctg ctgagcggct gcagccacct gctgcacaac aaggagctga     1500

tggccagcct ggccgagagc agcttcgacg tgatgctgac cgaccctttc ctgccttgca     1560

gccctatcgt ggcccagtac ctgagcctgc ctaccgtgtt cttcctgcac gccctgcctt     1620

gcagcctgga gttcgaggcc acccagtgcc ctaacccttt cagctacgtg cctcggcctc     1680

tgagcagcca cagcgaccac atgaccttcc tgcagcgggt gaagaacatg ctgatcgcct     1740

tcagccagaa cttcctgtgc gacgtggtgt acagccctta cgccaccctg gccagcgagt     1800

tcctgcagcg ggaggtgacc gtgcaggacc tgctgagcag cgccagcgtg tggctgttcc     1860

ggagcgactt cgtgaaggac taccctcggc ctatcatgcc taacatggtg ttcgtgggag     1920

gaatcaactg cctgcaccag aaccctctga gccaggagtt cgaggcctac atcaacgcca     1980

gcggagagca cggaatcgtg gtgttcagcc tgggaagcat ggtgagcgag atccctgaga     2040

agaaggccat ggccatcgcc gacgccctgg gaaagatccc tcagaccgtg ctgtggcggt     2100

acaccggaac ccggcctagc aacctggcca acaacaccat cctggtgaag tggctgcctc     2160

agaacgatct gctgggacac cctatgaccc gggccttcat cacccacgcc ggaagccacg     2220

gagtgtacga gagcatctgc aacggagtgc ctatggtgat gatgcctctg ttcggagacc     2280

agatggacaa cgccaagcgg atggagacca agggagccgg agtgaccctg aacgtgctgg     2340

agatgaccag cgaggacctg gagaacgccc tgaaggccgt gatcaacgat aagagctaca     2400

aggagaacat catgcggctg agcagcctgc acaaggaccg gcctgtggag cctctggacc     2460

tggccgtgtt ctgggtggag ttcgtgatgc ggcacaaggg agcccctcac ctgcggcctg     2520

ccgcccacga cctgacctgg taccagtacc acagcctgga cgtgatcgga ttcctgctgg     2580

ccgtggtgct gaccgtggcc ttcatcacct tcaagtgctg cgcctacgga taccggaagt     2640

gcctgggaaa gaagggacgg gtgaagaagg cccacaagag caagacccac tgataaggat     2700

ccagatctgc ctcgactgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg     2760

tgccttcctt gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa     2820

ttgcatcgca ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcaggaca     2880

gcaaggggga ggattgggaa gacaatagca ggcatgctgg ggactcgagt taagggcgaa     2940

ttcccgatta ggatcttcct agagcatggc tacgtagata agtagcatgg cgggttaatc     3000

attaactaca aggaacccct agtgatggag ttggccactc cctctctgcg cgctcgctcg     3060

ctcactgagg ccgggcgacc aaaggtcgcc cgacgcccgg gctttgcccg ggcggcctca     3120

gtgagcgagc gagcgcgcag                                                 3140


<210>  16
<211>  3140
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  pAAV.TBG.U011TY.BGH (p4119)


<220>
<221>  repeat_region
<222>  (1)..(168)
<223>  AAV 5' ITR

<220>
<221>  enhancer
<222>  (211)..(310)
<223>  alpha mic/bik

<220>
<221>  enhancer
<222>  (317)..(416)
<223>  alpha mic/bik

<220>
<221>  promoter
<222>  (431)..(907)
<223>  TBG promoter

<220>
<221>  intron
<222>  (939)..(1071)
<223>  SV40 misc intron (Promega)

<220>
<221>  misc_feature
<222>  (1092)..(2690)
<223>  CDS for modified U011TY

<220>
<221>  polyA_signal
<222>  (2709)..(2923)
<223>  BGH polyA

<220>
<221>  repeat_region
<222>  (2973)..(3140)
<223>  AAV 3' ITR

<400>  16
ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt       60

ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact      120

aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct      180

aggaagatcg gaattcgccc ttaagctagc aggttaattt ttaaaaagca gtcaaaagtc      240

caagtggccc ttggcagcat ttactctctc tgtttgctct ggttaataat ctcaggagca      300

caaacattcc agatccaggt taatttttaa aaagcagtca aaagtccaag tggcccttgg      360

cagcatttac tctctctgtt tgctctggtt aataatctca ggagcacaaa cattccagat      420

ccggcgcgcc agggctggaa gctacctttg acatcatttc ctctgcgaat gcatgtataa      480

tttctacaga acctattaga aaggatcacc cagcctctgc ttttgtacaa ctttccctta      540

aaaaactgcc aattccactg ctgtttggcc caatagtgag aactttttcc tgctgcctct      600

tggtgctttt gcctatggcc cctattctgc ctgctgaaga cactcttgcc agcatggact      660

taaacccctc cagctctgac aatcctcttt ctcttttgtt ttacatgaag ggtctggcag      720

ccaaagcaat cactcaaagt tcaaacctta tcattttttg ctttgttcct cttggccttg      780

gttttgtaca tcagctttga aaataccatc ccagggttaa tgctggggtt aatttataac      840

taagagtgct ctagttttgc aatacaggac atgctataaa aatggaaaga tgttgctttc      900

tgagagactg cagaagttgg tcgtgaggca ctgggcaggt aagtatcaag gttacaagac      960

aggtttaagg agaccaatag aaactgggct tgtcgagaca gagaagactc ttgcgtttct     1020

gataggcacc tattggtctt actgacatcc actttgcctt tctctccaca ggtgtccagg     1080

cggccgccac catggctgtg gaaagccagg gcggccggcc cctggtgctg ggcctgctgc     1140

tgtgtgtgct gggccccgtg gtgagccacg ctggcaagat tctgctgatt cccgtggacg     1200

gcagccactg gctgagcatg ctgggcgcta ttcagcagct gcagcagcgg ggccacgaaa     1260

ttgtggtgct ggctcccgac gctagcctgt acattcggga cggcgctttt tacaccctga     1320

agacctaccc cgtgcccttt cagcgggaag acgtgaagga aagctttgtg agcctgggcc     1380

acaacgtgtt tgaaaacgat agctttctgc agcgggtgat taagacctac aagaagatta     1440

agaaggacag cgctatgctg ctgagcggct gtagccacct gctgcacaac aaggaactga     1500

tggctagcct ggctgaaagc agctttgacg tgatgctgac cgaccccttt ctgccctgta     1560

gccccattgt ggctcagtac ctgagcctgc ccaccgtgtt ttttctgcac gctctgccct     1620

gtagcctgga atttgaagct acccagtgtc ccaacccctt tagctacgtg ccccggcccc     1680

tgagcagcca cagcgaccac atgacctttc tgcagcgggt gaagaacatg ctgattgctt     1740

ttagccagaa ctttctgtgt gacgtggtgt acagccccta cgctaccctg gctagcgaat     1800

ttctgcagcg ggaagtgacc gtgcaggacc tgctgagcag cgctagcgtg tggctgtttc     1860

ggagcgactt tgtgaaggac tacccccggc ccattatgcc caacatggtg tttgtgggcg     1920

gcattaactg tctgcaccag aaccccctga gccaggaatt tgaagcttac attaacgcta     1980

gcggcgaaca cggcattgtg gtgtttagcc tgggcagcat ggtgagcgaa attcccgaaa     2040

agaaggctat ggctattgct gacgctctgg gcaagattcc ccagaccgtg ctgtggcggt     2100

acaccggcac ccggcccagc aacctggcta acaacaccat tctggtgaag tggctgcccc     2160

agaacgatct gctgggccac cccatgaccc gggcttttat tacccacgct ggcagccacg     2220

gcgtgtacga aagcatttgt aacggcgtgc ccatggtgat gatgcccctg tttggcgacc     2280

agatggacaa cgctaagcgg atggaaacca agggcgctgg cgtgaccctg aacgtgctgg     2340

aaatgaccag cgaagacctg gaaaacgctc tgaaggctgt gattaacgat aagagctaca     2400

aggaaaacat tatgcggctg agcagcctgc acaaggaccg gcccgtggaa cccctggacc     2460

tggctgtgtt ttgggtggaa tttgtgatgc ggcacaaggg cgctccccac ctgcggcccg     2520

ctgctcacga cctgacctgg taccagtacc acagcctgga cgtgattggc tttctgctgg     2580

ctgtggtgct gaccgtggct tttattacct ttaagtgttg tgcttacggc taccggaagt     2640

gtctgggcaa gaagggccgg gtgaagaagg ctcacaagag caagacccac tgataaggat     2700

ccagatctgc ctcgactgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg     2760

tgccttcctt gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa     2820

ttgcatcgca ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcaggaca     2880

gcaaggggga ggattgggaa gacaatagca ggcatgctgg ggactcgagt taagggcgaa     2940

ttcccgatta ggatcttcct agagcatggc tacgtagata agtagcatgg cgggttaatc     3000

attaactaca aggaacccct agtgatggag ttggccactc cctctctgcg cgctcgctcg     3060

ctcactgagg ccgggcgacc aaaggtcgcc cgacgcccgg gctttgcccg ggcggcctca     3120

gtgagcgagc gagcgcgcag                                                 3140


<210>  17
<211>  3140
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  pAAV.TBG.U001.BGH (p4118)

<400>  17
ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg ggcgaccttt       60

ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa ctccatcact      120

aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta gccatgctct      180

aggaagatcg gaattcgccc ttaagctagc aggttaattt ttaaaaagca gtcaaaagtc      240

caagtggccc ttggcagcat ttactctctc tgtttgctct ggttaataat ctcaggagca      300

caaacattcc agatccaggt taatttttaa aaagcagtca aaagtccaag tggcccttgg      360

cagcatttac tctctctgtt tgctctggtt aataatctca ggagcacaaa cattccagat      420

ccggcgcgcc agggctggaa gctacctttg acatcatttc ctctgcgaat gcatgtataa      480

tttctacaga acctattaga aaggatcacc cagcctctgc ttttgtacaa ctttccctta      540

aaaaactgcc aattccactg ctgtttggcc caatagtgag aactttttcc tgctgcctct      600

tggtgctttt gcctatggcc cctattctgc ctgctgaaga cactcttgcc agcatggact      660

taaacccctc cagctctgac aatcctcttt ctcttttgtt ttacatgaag ggtctggcag      720

ccaaagcaat cactcaaagt tcaaacctta tcattttttg ctttgttcct cttggccttg      780

gttttgtaca tcagctttga aaataccatc ccagggttaa tgctggggtt aatttataac      840

taagagtgct ctagttttgc aatacaggac atgctataaa aatggaaaga tgttgctttc      900

tgagagactg cagaagttgg tcgtgaggca ctgggcaggt aagtatcaag gttacaagac      960

aggtttaagg agaccaatag aaactgggct tgtcgagaca gagaagactc ttgcgtttct     1020

gataggcacc tattggtctt actgacatcc actttgcctt tctctccaca ggtgtccagg     1080

cggccgccac catggccgtg gagagccagg gggggcggcc cctggtgctg gggctgctgc     1140

tgtgcgtgct ggggcccgtg gtgagccacg ccgggaagat cctgctgatc cccgtggacg     1200

ggagccactg gctgagcatg ctgggggcca tccagcagct gcagcagcgg gggcacgaga     1260

tcgtggtgct ggcccccgac gccagcctgt acatccggga cggggccttc tacaccctga     1320

agacctaccc cgtgcccttc cagcgggagg acgtgaagga gagcttcgtg agcctggggc     1380

acaacgtgtt cgagaacgat agcttcctgc agcgggtgat caagacctac aagaagatca     1440

agaaggacag cgccatgctg ctgagcgggt gcagccacct gctgcacaac aaggagctga     1500

tggccagcct ggccgagagc agcttcgacg tgatgctgac cgaccccttc ctgccctgca     1560

gccccatcgt ggcccagtac ctgagcctgc ccaccgtgtt cttcctgcac gccctgccct     1620

gcagcctgga gttcgaggcc acccagtgcc ccaacccctt cagctacgtg ccccggcccc     1680

tgagcagcca cagcgaccac atgaccttcc tgcagcgggt gaagaacatg ctgatcgcct     1740

tcagccagaa cttcctgtgc gacgtggtgt acagccccta cgccaccctg gccagcgagt     1800

tcctgcagcg ggaggtgacc gtgcaggacc tgctgagcag cgccagcgtg tggctgttcc     1860

ggagcgactt cgtgaaggac tacccccggc ccatcatgcc caacatggtg ttcgtggggg     1920

ggatcaactg cctgcaccag aaccccctga gccaggagtt cgaggcctac atcaacgcca     1980

gcggggagca cgggatcgtg gtgttcagcc tggggagcat ggtgagcgag atccccgaga     2040

agaaggccat ggccatcgcc gacgccctgg ggaagatccc ccagaccgtg ctgtggcggt     2100

acaccgggac ccggcccagc aacctggcca acaacaccat cctggtgaag tggctgcccc     2160

agaacgatct gctggggcac cccatgaccc gggccttcat cacccacgcc gggagccacg     2220

gggtgtacga gagcatctgc aacggggtgc ccatggtgat gatgcccctg ttcggggacc     2280

agatggacaa cgccaagcgg atggagacca agggggccgg ggtgaccctg aacgtgctgg     2340

agatgaccag cgaggacctg gagaacgccc tgaaggccgt gatcaacgat aagagctaca     2400

aggagaacat catgcggctg agcagcctgc acaaggaccg gcccgtggag cccctggacc     2460

tggccgtgtt ctgggtggag ttcgtgatgc ggcacaaggg ggccccccac ctgcggcccg     2520

ccgcccacga cctgacctgg taccagtacc acagcctgga cgtgatcggg ttcctgctgg     2580

ccgtggtgct gaccgtggcc ttcatcacct tcaagtgctg cgcctacggg taccggaagt     2640

gcctggggaa gaaggggcgg gtgaagaagg cccacaagag caagacccac tgataaggat     2700

ccagatctgc ctcgactgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg     2760

tgccttcctt gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa     2820

ttgcatcgca ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcaggaca     2880

gcaaggggga ggattgggaa gacaatagca ggcatgctgg ggactcgagt taagggcgaa     2940

ttcccgatta ggatcttcct agagcatggc tacgtagata agtagcatgg cgggttaatc     3000

attaactaca aggaacccct agtgatggag ttggccactc cctctctgcg cgctcgctcg     3060

ctcactgagg ccgggcgacc aaaggtcgcc cgacgcccgg gctttgcccg ggcggcctca     3120

gtgagcgagc gagcgcgcag                                                 3140


<210>  18
<211>  1599
<212>  DNA
<213>  Artificial Sequence

<220>
<223>  engineered UGT1A1 U3G

<400>  18
atggctgtgg agagccaggg gggcaggccc ctggtgctgg gcctgctgct gtgtgtgctg       60

ggccctgtgg tgagccatgc tggcaagatc ctgctgatcc ctgtggatgg cagccactgg      120

ctgagcatgc tgggggccat ccagcagctg cagcagaggg gccatgagat tgtggtgctg      180

gcccctgatg ccagcctgta catcagggat ggggccttct acaccctgaa gacctaccct      240

gtgcccttcc agagggagga tgtgaaggag agctttgtga gcctgggcca caatgtgttt      300

gagaatgaca gcttcctgca gagggtgatc aagacctaca agaagatcaa gaaggactct      360

gccatgctgc tgtctggctg cagccacctg ctgcacaaca aggagctgat ggccagcctg      420

gctgagagca gctttgatgt gatgctgact gaccccttcc tgccctgcag ccccattgtg      480

gcccagtacc tgagcctgcc cactgtgttc ttcctgcatg ccctgccctg cagcctggag      540

tttgaggcca cccagtgccc caaccccttc agctatgtgc ccaggcccct gagcagccac      600

tctgaccaca tgaccttcct gcagagggtg aagaacatgc tgattgcctt cagccagaac      660

ttcctgtgtg atgtggtgta cagcccctat gccaccctgg cctctgagtt cctgcagagg      720

gaggtgactg tgcaggacct gctgagctct gcctctgtgt ggctgttcag gtctgacttt      780

gtgaaggact accccaggcc catcatgccc aacatggtgt ttgtgggggg catcaactgc      840

ctgcaccaga accccctgag ccaggagttt gaggcctaca tcaatgcctc tggggagcat      900

ggcatagtgg tgttcagcct gggcagcatg gtgtctgaga tccctgagaa gaaggccatg      960

gccattgctg atgccctagg caagatcccc cagactgtgc tgtggaggta cactggcacc     1020

aggcccagca acctggccaa caacaccatc ctggtgaagt ggctgcccca gaatgacctg     1080

ctgggccacc ccatgaccag ggccttcatc acccatgctg gcagccatgg ggtgtatgag     1140

agcatctgca atggggtgcc catggtgatg atgcccctgt ttggggacca gatggacaat     1200

gccaagagga tggaaaccaa gggggctggg gtgaccctga atgtgctgga gatgacctct     1260

gaggacctgg agaatgccct gaaggctgtg atcaatgaca agagctacaa ggagaacatc     1320

atgaggctga gcagcctgca caaggacagg cctgtggagc ccctggacct ggctgtgttc     1380

tgggtggagt ttgtgatgag gcataagggg gccccccacc tgaggcctgc tgcccatgac     1440

ctgacctggt accagtacca cagcctagat gtgattggct tcctgctggc tgtggtgctg     1500

actgtggcct tcatcacctt taagtgctgt gcctatggct acaggaagtg cctgggcaag     1560

aagggcaggg tgaagaaggc ccacaagagc aagacccac                            1599


