THE CLOUD-IN-CLOUD PROBLEM IN THE PRESS-SCHECHTER FORMALISM

OF HIERARCHICAL STRUCTURE FORMATION

Karsten Jedamzik

Physics Research Program

Institute for Geophysics and Planetary Physics

University of California

Lawrence Livermore National Laboratory

Livermore, CA 94550

Abstract

The formalism by Press and Schechter (PS) is often used to infer number densities of virialized objects of mass (e.g. quasars, galaxies, clusters of galaxies, etc.) from a count of initially overdense regions in a Gaussian density perturbation field. We reanalyze the PS-formalism by explicitly counting underdense regions which are embedded within overdense regions, so called cloud-in-clouds. In contrast to the original PS-formalism, our revised analysis automatically accounts for all the cosmic material. We find that mass distribution functions for virialized objects are altered by the proper solution of the cloud-in-cloud problem. These altered distribution functions agree much better with distribution functions inferred from N-body simulations than the original PS-distribution functions.

Subject headings: cosmology - theory, large-scale structure of the universe

1. Introduction

In our current understanding the presently observed structure of the universe has formed by the gravitational growth of small-amplitude density perturbations extant at the epoch of matter-radiation equality at high redshift (). It is well known that a detailed comparison between quantities of the observed structure (e.g. spatial and angular correlation functions, the peculiar velocity field, the number densities of quasars, damped Lyman- clouds, and galaxy clusters) and predictions of these quantities in specific galaxy formation scenarios can yield valuable information about the matter content of the universe and the nature of the initial density perturbations. For a reliable prediction of the structure formed in specific cosmic scenarios one has, in principle, to resort to detailed N-body simulations. In practice, however, such N-body simulations may be rather time-consuming and can currently only resolve a limited dynamic range of mass.

A complimentary and analytic approach to the problem of structure formation has been developed by Press & Schechter 1974 (hereafter; PS). In their analysis the number densities and masses of virialized objects is directly inferred from the statistics of the initial density perturbation field in a manner to be illustrated in the following section. The PS-prediction for the fraction of cosmic matter contained in virialized and separate halos of mass , so called multiplicity- or mass- functions, is in fairly good agreement with results from N-body calculations indicating the general validity of the PS-approach. The PS-mass functions have been applied to predict the number densities of galaxies (Cole & Kaiser 1989; White & Frenk 1991; Kauffmann, White, & Guiderdoni 1993), clusters of galaxies (Cole & Kaiser 1988; Zhan 1990; Bartlett & Silk 1993), quasars (Carlberg 1990; Nusser & Silk 1993), Lyman- clouds (Kauffmann & Charlot 1994; Mo & Miralda-Escudé 1994), and dark halos (Narayan & White 1988) which emerge in different structure formation scenarios and at different redshifts. Recently the analysis by PS has also been extended to estimate the frequency of halo mergers and to predict details about the substructure of clusters of halos (Bower 1991; Bond et al. 1991; Kauffmann & White 1993; Lacey & Cole 1993; 1994).

It has been noted, however, that the original analysis by PS is strictly speaking incorrect since it only associates half of the mass of the initial density perturbation field with eventually to be virialized and gravitationally self-bound structures. PS achieved proper normalization of the mass functions by simply multiplying results by a factor of two. The remaining half of cosmic material, which is not automatically accounted for in the PS-formalism, is material initially present in underdense regions. It is believed that this remaining half of material is accounted for if a proper solution to the cloud-in-cloud problem is found, in particular the problem of correctly counting underdense regions which are embedded within overdense regions. The cloud-in-cloud problem has also plagued other analytic approaches to the subject of structure formation, such as the peaks formalism (Peacock & Heavens 1985; Bardeen et al. 1986).

Bond et al. (1991) proposed a solution to the cloud-in-cloud problem by considering excursion set mass functions. They were able to obtain analytic results for the mass functions if they applied sharp -space filtering. In sharp -space filtering average densities within a given volume are computed only from those Fourier modes of the density distribution which have wavelength larger than the characteristic dimension of the volume (refer to the next section). Their analysis accounts for all the cosmic material. Surprisingly, their analysis recovers the original PS-mass functions including the normalization factor of two. A similar result has been obtained by Peacock and Heavens (1990). When different filtering functions are considered (e.g. top hat or Gaussian filters in coordinate space) the shapes of mass functions may be altered.

In this paper we reanalyze the cloud-in-cloud problem in the context of the PS-formalism. The method outlined is an extension of the original PS-analysis which reduces to the PS-prescription when the cloud-in-cloud problem is (incorrectly) ignored. We find that a proper treatment of the cloud-in-cloud problem changes the shapes of mass functions, in contrast to the conclusions of Bond et al. (1991) and Peacock & Heavens (1990). We show that the renormalized mass functions are in much better agreement with results from N-body simulations than the original PS mass functions.

2. The cloud-in-cloud problem and mass distribution functions

It is well known that length and amplitude of small-amplitude density perturbations (i.e. amplitude ; where is the mass density of the perturbation and is the average cosmic mass density) grow proportionally to the cosmic scale factor during a matter-dominated epoch. When reaching non-linearity () perturbations recollapse and virialize. (cf. Kolb & Turner 1989). In the absence of dissipative processes fluctuations will maintain their sizes and densities thereafter. Perturbations which are overdense by more than some critical amount at the epoch of matter-radiation density equality will have recollapsed before the epoch with redshift . The critical amplitude is approximately (cf. Kolb & Turner 1989). Here is the redshift of matter-radiation density equality.

One can probe the initial density perturbation field by determining the fraction of space which is overdense by the critical amount when averaged over a spherical volume (or, equivalently, on a mass scale ). Presently most structure formation scenarios assume the existence of Gaussian density perturbations as the seeds for the cosmic structure. In this case, the probability to find a region of mass scale to be overdense by more than the critical amount is simply given by an integral over the tail of a Gaussian distribution function

The width of this Gaussian distribution function, , can be inferred from the statistics of the Fourier modes of the density distribution. For Gaussian random fluctuations the magnitude of the amplitudes of the Fourier modes are distributed according to a half-Gaussian. The follow a flat distribution in the interval so that different Fourier modes have uncorrelated phases . In this case, the overdensity at some coordinate is the sum over several uncorrelated terms. By the central limit theorem the distribution function for will then be a Gaussian centered around value zero.

If one wishes to determine the average density on a mass scale (or in a volume ) only those Fourier modes should be added which have wavelengths larger than the characteristic dimension of the volume. Such a prescription corresponds to sharp -space filtering. It is expected that the contributions from Fourier modes which have wavelengths shorter than the characteristic dimension of the volume will average out to zero.

The variance of the resultant distribution function for can be computed from the variance of the half-Gaussian distribution function for . Here the square brackets denote an ensemble average. The ensemble includes different universes with different randomly chosen and distributed according to the half-Gaussian- and flat distribution functions, respectively. Note, that we assume implicitly that an ensemble average is equivalent to a spatial average. Then

In this expression counts the number of different Fourier modes and is the length of the fundamental cube of the simulation. (Results will emerge independent from .) Bond et al. (1991) have noted that equation (2a) is completely analogous to computing the mean-square-distance (corresponding to ) which a particle random walking in one dimension will have moved away from it’s initial position, given the expectation value of the square of the particle’s mean free path (corresponding to ) and the number of scatterings (corresponding to the number of Fourier modes). In particular .

The sharp -space cutoff in equation (2a) is determined by the mass of the probing volume

The relationship between the cutoff wavenumber and the mass is ambiguous up to factors of unity. This is because it is not evident if the diameter of a spherical volume should be set equal to either the cutoff wavelength or, for example, half the cutoff wavelength. This ambiguity results in an uncertainty of what mass should ultimately be associated with an initially overdense region. It does, however, not affect the shape of the resulting mass function. Others have chosen a value of (Lacey & Cole 1994).

One often assumes a featureless power-law for the variance of the magnitude of the Fourier amplitude, i.e. . In this case, equation (2a) yields

Here is the mass scale of unity variance . The coefficient and the exact relationship between mass and cutoff momentum are included in .

In specific scenarios of structure formation and if one desires to work over extended ranges of mass the featureless power-law should be replaced by the appropriate functional form for which, in general, is more complicated. The dependence of on determined at the epoch of matter-radiation density equality for a variety of cosmic scenarios can, for example, be taken from the paper by Holtzmann (1989). Equation (2a) can then be numerically integrated to yield an analogous relationship to equation (3).

It is now our goal to compute the number density of isolated regions with an overdensity exceeding the critical one, . Isolated regions overdense by or more are regions which are not included in yet larger regions overdense by or more. In particular, a region of mass scale will only be counted as an eventually virialized object of mass , if for any larger mass scale the average overdensity of the region is below the threshold. Consider a set of mass scales which are ordered by size . Furthermore, consider a large representive cosmic volume . The total volume, , which is found to be overdense by or more on the smallest scale is given by the number of isolated objects of scale , , times the size of an individual object’s volume, , plus the contributions from finding overdense regions on scale included in larger isolated regions . The latter contribution, for example from isolated regions on scale , is given by the product of number of isolated regions on scale , , the volume of these isolated regions, , and the probability of finding a region of scale overdense by or more, provided it is included in an isolated, overdense region of size . In the original PS-analysis it was erroneously assumed that this probability is unity. In a similar manner isolated regions of even larger sizes make contributions to .

When we add up all contributions we find

where the limiting probability is understood to be unity. Equation (4a) can be rewritten into integral form

Here is the number density of isolated and overdense regions per unit mass interval. This is exactly the desired mass function which PS intended to derive. Completely analogous equations to equations (4ab) can be written down for any mass scale . The resulting set of equations then represent a matrix equation, such that the product of matrix and vector has to equal the vector . This matrix has a form which is already suitable for Gaussian back substitution. This is because the only depend on those with and so all lower off-diagonal elements of the matrix are zero (i.e. , for ). To obtain one has, in principle, to solve an infinitely large matrix equation. In practice, however, one can obtain very good approximations to by truncating the matrix for large masses. The number density of isolated, overdense regions with large masses is exponentially suppressed so that the upper, infinite bound in equations (4ab) can be replaced by some large mass .

To fully appreciate the modification to the original PS-formalism one can take the derivative of equation (4b) with respect to mass . This yields

Note, that by neglecting the second term on the right-hand side of equation (5), the original PS-prescription is recovered. However, this term does not vanish since the probability does vary with .

The probability to find a region of size overdense by or more when it is included in a larger region overdense by or more can be computed in the following way. The overdensity of the large region is determined by the sum of the amplitudes of all the Fourier modes between and the cutoff , where the cutoff is computed from equation (2b) for the mass . Substructure within the large region, that is the overdensity (underdensity) for any smaller region contained within relative to the overdensity , is determined by the sum of the amplitudes of all the Fourier modes which have wave numbers between and . Here is computed for the smaller mass . The probability distribution for the overdensity of the smaller region will follow a Gaussian distribution centered at the larger region’s overdensity . If one assumes a power-law for the variances the width of this Gaussian can be obtained from equation (2a). This is accomplished if one replaces the lower integral limit in equation (2a) by

The probability is then computed by a double integral. The first integral is performed over all possible larger region’s overdensities having . The second integral computes for any given the probability of finding a smaller region’s overdensity to exceed the threshold as well. This latter integral is obtained by integrating over the Gaussian distribution of the smaller region’s overdensity. We find

where the constant assures proper normalization

The width of the larger region’s overdensity distribution is given by equation (3) with replaced by . In the limit the width approaches zero and the second integral in eq(7a) becomes a delta function. This yields a limiting probability of . In the opposite limit when approaches zero the probability approaches .

It is the goal to solve the implicit equation (5) for . We have derived a rather lengthy expression for which, however, is more suitable for numerical integration than equation (7a)

Here erfc() is the complimentary error function

The derivative can be obtained from equation (6).

It is now straightforward to solve equation (5) for with the help of equation (1) and equation (8). This is done by assuming a very large mass for which is exponentially small. For such a large mass one obtains a good approximation to by only considering the first term on the right-hand-side of equation (5). To compute for smaller masses the full equation (5) should be employed. This is easily done if one derives for successively smaller masses by using previously determined for all the larger masses. In this way the shape of the original PS-mass function will be altered.

We have numerically solved equation (5) to obtain the multiplicity function. (We will give an analytic approximation below.) The multiplicity function is the fraction of matter contained in virialized objects of mass per unit logarithmic mass interval . The multiplicity function is related to by

Results for multiplicity functions are presented in figure (1). In this figure, the upper panel shows the multiplicity function for a spectral index , whereas the lower panel shows the multiplicity function for a spectral index of . The heavy solid lines show the renormalized multiplicity functions derived in this paper. For comparison the light solid lines represent the original PS-results. The figure also shows multiplicity functions inferred from N-body simulations. These have been taken from Efstathiou et al. (1988) (cf. figs. 9 in Efstathiou et al.). The multiplicity functions inferred from N-body simulations are shown for the last seven output times of the simulation and are scaled according to self-similarity of the multiplicity functions at different times. The up-turn of the multiplicity functions for lower multiplicities or, equivalently, masses arises from the limited resolution of the N-body simulations.

The figure illustrates that the multipicity functions derived from equation (5), which include the solution of the cloud-in-cloud problem, provide a much better approximation of the N-body results than the original PS-multiplicity functions. Over the limited range of masses (multiplicities) shown in figure (1) the differences between the PS-result and the result from equation (5) are within a factor of two.

In figures (2abc) we show the multiplicity functions on a double-logarithmic plot over an extended mass range. The solid lines represent the renormalized multiplicity functions, whereas the dashed lines show the original PS multiplicity functions. It is evident that the differences between these two multiplicity functions can be as large as an order of magnitude at the small-mass scale side of the distribution. In particular, the PS-analysis systematically underestimates the number densities of small mass objects. This discrepancy between our result and the PS-approximation is particularly large for a positive spectral index. On the large-mass scale end of the distribution the PS multiplicity functions overestimate the number densities of objects by a factor of two independent of the spectral index. However, this factor-of-two difference can be easily absorbed into the uncertainties associated with the absolute mass scale of objects (cf. equation 2b). This is because there is no change in the shape of the multiplicity functions at the rare, massive-object side of the distribution and multiplicity functions decrease very rapidly for large masses.

The modification of the mass distribution functions presented here does not affect the self-similarity of mass distribution functions at different times in hierarchical structure formation. In particular, as described by PS the shapes of the mass functions are independent of epoch or, equivalently, threshold . Mass functions differ only for different by an increase in the characteristic mass scale of the distribution functions with decreasing threshold .

The formalism developed naturally assigns all the cosmic material to overdense regions which will eventually form virialized objects. Thus, the missing half of cosmic material, which could not be accounted for in the PS-formalism, is accounted for once a solution to the cloud-in-cloud problem is known. This can be seen by considering a mass scale very much smaller than the characteristic mass scale of the peak of the multiplicity function. In the limit when approaches zero both and approach one-half. With the help of equation (4b) one can then infer

so that the computed will automatically have the proper normalization.

Our results are in direct contradiction with the results obtained by Bond et al. (1991) and Peacock & Heavens (1990). Their analysis confirmed the original PS-result. Bond et al. consider a set of different-sized mass scales. For each point in space they determine the largest of these mass scales for which the region around the point is overdense by the critical amount. They then associate such a region with an eventually self-bound object of mass . They perform such an analysis independently for each space point by considering all possible realizations of Fourier-mode amplitudes. Their analysis is incorrect, however, since it does not allow for correlations between neighboring space points. In other words, if their analysis finds an object of mass at point it will also find an object of mass at point and it will count both as separate objects. In reality, the object of mass at will have a finite size and will include the space point at , so that only one object of mass should be counted. Note, that our treatment implicitly includes spatial correlations by the determination of the probability function .

We have not been able to derive an analytic solution for . It is not always convenient to have to invert a matrix for a determination of multiplicity functions and number densities of virialized objects. We therefore obtained an analytic approximation for

with

Here is the original PS-result, in particular

The multiplicity functions derived by using the analytic approximation equation (11a) are shown in figures (2abc) by the dotted lines. Over the whole range of masses shown in these figures and for all three spectral indices equation (11a) approximates the renormalized number densities by better than ten per cent. If higher accuracies are desired a numerical treatment has to be performed.

3. Conclusions

We have reanalyzed the Press-Schechter formalism which is often used to estimate number densities of virialized objects from the statistics of the primordial density perturbations. Our result addresses the importance of the cloud-in-cloud problem, in particular the correct count of underdense regions which are embedded within overdense regions. In contrast to previous work we find that by considering a proper solution to the cloud-in-cloud problem mass distribution functions are modified. These renormalized mass distribution functions provide an excellent approximation of mass distribution functions inferred from N-body simulations. Our analysis shows that the original PS-formalism systematically overestimates the number of rare, massive objects by a factor of two, while the number of low-mass objects are underestimated by up to an order of magnitude. The approach described here avoids this.

4. Acknowledgements

The author is grateful to G.J. Mathews for many inspiring discussions. The author wishes also to acknowledge discussions with S. Charlot and G. Kauffmann. This work was performed under the auspices of the U.S. Department of Energy by the Lawrence Livermore National Laboratory under contract number W-7405-ENG-48 and DoE Nuclear Theory Grant SF-ENG-48.

5. References

Bardeen, J.M., Bond, J.R., Kaiser, N., & Szalay, A. S., 1986, ApJ, 304, 15 Bartlett, J.G. & Silk, J. 1993, ApJ (Letters), 407, L45 Bond, J.R., Cole, S., Efstathiou, G., & Kaiser, N. 1991, ApJ, 379, 440 Bower, R.G. 1991, MNRAS, 248, 332 Carlberg, R.G. 1990, ApJ, 350, 505 Cole, S. & Kaiser, N. 1989, MNRAS, 237, 1127 Efstathiou, G., Frenk, C.S., White, S.D.M., Davis, M. 1988, MNRAS, 235, 715 Holtzman, J.A. 1989, ApJS, 71, 1 Kauffmann, G. & White, S.D.M. 1993, MNRAS, 261, 921 Kauffmann, G., White, S.D.M., & Guiderdoni, B. 1993, MNRAS, 264, 201 Kauffmann, G. & Charlot, S. 1994, ApJ, submitted Kolb, E.W. & Turner, M.S. 1989, in The Early Universe, Addison-Wesley, pg 327 Lacey, C. & Cole, S. 1993, MNRAS, 262, 627 Lacey, C. & Cole, S. 1994, MNRAS, submitted Mo, H.J. & Miralda-Escudé, J. 1994, ApJ, submitted Narayan, R. & White, S.D.M. 1988, MNRAS, 231, 97p Nusser, A. & Silk, J. 1993, ApJ (Letters), 411, L1 Peacock, J.A. & Heavens, A.F. 1985, MNRAS, 217, 805 Peacock, J.A. & Heavens, A.F. 1990, MNRAS, 243, 133 Press, W.H. & Schechter, P. 1974, ApJ, 187, 425 White, S.D.M. & Frenk, C.S. 1991, ApJ, 379, 25 Zhan, Y. 1990, ApJ, 355, 387

6. Figure Captions

Figure 1The multiplicity function, i.e. the fraction of cosmic material contained in virialized and isolated objects of mass per unit logarithmic mass interval (cf. equation 9) as a function of . The absolute value of mass on the ordinate is arbitrary. The heavy solid line shows the result derived in this paper. The light solid line shows the original PS-result. For comparison the solid (up-turning), dotted, short-dashed, long-dashed, short-dashed dotted, long-dashed dotted, and short-dashed long-dashed lines give results from N-body simulations. The N-body results have been taken from Efstathiou et al. (1988). The upper panel shows the multiplicity function for a power-law spectral index of for the variance of the Fourier mode amplitudes of the density distribution, whereas the lower panel shows the multiplicity function for a spectral index of .

Figure 2aThe multiplicity function presented double-logarithmically as a function of rescaled mass . The solid line shows the renormalized multiplicity function, whereas the dashed line shows the original PS-multiplicity function. The dotted line represents the multiplicity function computed by using the analytic approximation equation (11). The calculation assumes a spectral index of .

Figure 2bSame as figure (2a), but here a spectral index of is assumed.

Figure 2cSame as figure (2a), but here a spectral index of is assumed.