Suppose you are constructing a human genomic library in BAC vectors where the human DNA fragments are on average 100,000 bp.

a. What is the minimum number of different recombinant BACs you need to construct in order to have a greater than zero chance of having a complete library, meaning one in which the entire genome is represented?

The simple statistical equation that follows allows you to determine the size that a genomic library needs to be (that is, the number of independent recombinant clones you need to make) for a given likelihood that the entire genome is represented in the library.

$$N = \frac{\ln(1 - P)}{\ln(1 - f)}$$

In the equation, N is the number of independent recombinant clones; P is the probability that any particular part of the genome is represented at least one time; and f is the fraction of the genome in a single recombinant clone.

(Note: ln is the natural log, sometimes written as $\log_e$.)
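The equation can be evaluated directly. The following is a minimal sketch, not part of the original problem; the function name `clones_needed` and the sample values are hypothetical and chosen only for illustration. It rounds N up because a library cannot contain a fraction of a clone.

```python
import math


def clones_needed(p, f):
    """Return N = ln(1 - P) / ln(1 - f), rounded up to a whole clone.

    p: desired probability that a given region is represented (0 < p < 1)
    f: fraction of the genome carried by one recombinant clone (0 < f < 1)
    """
    if not (0 < p < 1 and 0 < f < 1):
        raise ValueError("p and f must lie strictly between 0 and 1")
    # log1p(-x) computes ln(1 - x) accurately even when x is very small,
    # which is the usual case for f in a genomic library.
    n = math.log1p(-p) / math.log1p(-f)
    return math.ceil(n)


# Illustrative values only (not taken from the problem):
# each clone carries 1% of the genome and we want 95% confidence.
example_n = clones_needed(0.95, 0.01)
print(example_n)
```

Note that P = 1 makes ln(1 − P) undefined, so the sketch rejects it; the equation only gives a finite N for probabilities strictly below 100%.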

b. Calculate f for the genomic library described in part (a).

c. How many different recombinant BAC clones would you need to have a 99% chance that a specific 100,000 bp region of the genome is represented? How many clones for a 99.9% chance?

d. How many genomic equivalents correspond to each of your answers in part (c)?

e. Suppose that after you ligated the human DNA inserts with the BAC vectors and transformed E. coli with the mixture, you find that you have only 30,000 drug-resistant colonies transformed with recombinant plasmids. What is the chance that any specific 100,000 bp region of the genome is represented in a recombinant plasmid?

f. If you want to construct a complete human genomic library that contains the smallest number of independent recombinant clones possible, what is the key variable that you should adjust?
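For a question like part (e), where the number of clones is known and the probability is wanted, the library-size equation can be rearranged to P = 1 − (1 − f)^N. The sketch below shows that rearrangement under stated assumptions: the function name `coverage_probability` is hypothetical, and the haploid human genome size of about 3 × 10^9 bp is an assumption that the problem itself does not state.

```python
import math


def coverage_probability(n, f):
    """Return P = 1 - (1 - f)**n for n independent clones.

    n: number of independent recombinant clones (n >= 0)
    f: fraction of the genome carried by one clone (0 < f < 1)
    """
    if n < 0 or not (0 < f < 1):
        raise ValueError("n must be >= 0 and f must lie strictly between 0 and 1")
    # (1 - f)**n = exp(n * ln(1 - f)); expm1 and log1p keep precision for tiny f.
    return -math.expm1(n * math.log1p(-f))


# Assumption (not given in the problem): haploid human genome of ~3e9 bp.
GENOME_BP = 3.0e9
INSERT_BP = 100_000  # average insert size stated in the problem

f = INSERT_BP / GENOME_BP
p = coverage_probability(30_000, f)
print(f"{p:.3f}")
```

The same value of f also gives the number of genomic equivalents in a library, which is simply N × f.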

One difficulty in molecular cloning using plasmid vectors is that the restriction enzyme-digested vector can be resealed by DNA ligase without an insert of genomic DNA. The next two problems investigate methods to deal with this issue.

