I am testing the correlation between two physiological parameters in plants using the pic function in R. I am a bit stuck on the interpretation of the phylogenetically independent contrasts (PICs). Without PICs, I get a significant correlation, but there is no correlation between the PICs. What does the absence of correlation between the PICs mean?

- The correlation without PICs is an effect of phylogeny.

OR,

- There is no phylogenetic effect on the correlation.

Thank you

Your first option is the correct interpretation. The correlation you observe in the raw data is the effect of phylogeny.

Without using PICs, you might expect to see phylogenetic autocorrelation, i.e. your raw data correlate as a function of the amount of evolutionary history shared between species. Using PICs is one way of testing whether the correlation still exists once the effect of phylogenetic autocorrelation is removed. This is exactly what you observe: you see the correlation in the raw data (which are not phylogenetically independent), but it disappears when you use PICs. Without knowing any details of your study beyond what's in the question, it sounds like a textbook case of phylogenetic autocorrelation.

The PIC method originates with Felsenstein (1985). I'd encourage checking that paper out for further details on the method and interpretation of the results.
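To make the idea concrete, here is a minimal Python sketch of how contrasts are computed on a small, fully bifurcating tree. It is only an illustration, not what `pic` in R does internally: the tuple-based tree format, the function names, the four-species tree and the trait values are all made up, and every branch length is assumed to be positive. The contrasts are correlated through the origin, which is the usual convention for PICs.

```python
import math

def pic(tree, trait):
    """Contrasts for a fully bifurcating tree (illustrative sketch).

    A tip is (name, branch_length); an internal node is
    (left, right, branch_length). Returns
    (contrasts, node_value, adjusted_branch_length).
    """
    if len(tree) == 2:  # tip
        name, length = tree
        return [], trait[name], length
    left, right, length = tree
    c_left, x1, v1 = pic(left, trait)
    c_right, x2, v2 = pic(right, trait)
    contrast = (x1 - x2) / math.sqrt(v1 + v2)        # standardized contrast
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)  # weighted ancestral value
    length += v1 * v2 / (v1 + v2)                    # lengthen the stem branch
    return c_left + c_right + [contrast], value, length

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corr_through_origin(x, y):
    sxy = sum(a * b for a, b in zip(x, y))
    return sxy / math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))

# Hypothetical tree: two clades of two species, separated by long stems.
tree = ((("A", 1.0), ("B", 1.0), 5.0), (("C", 1.0), ("D", 1.0), 5.0), 0.0)
x = {"A": 0.0, "B": 4.0, "C": 10.0, "D": 14.0}
y = {"A": 4.0, "B": 0.0, "C": 14.0, "D": 10.0}

tips = ["A", "B", "C", "D"]
raw_r = pearson([x[t] for t in tips], [y[t] for t in tips])
pic_r = corr_through_origin(pic(tree, x)[0], pic(tree, y)[0])
print(round(raw_r, 2), round(pic_r, 2))  # 0.72 -0.28
```

In this toy data set the two clades differ in both traits, which produces the positive raw correlation, while within each clade the traits vary in opposite directions. Once the shared history is accounted for, the positive association is gone.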

## Correlation (Pearson, Kendall, Spearman)

**Correlation** is a bivariate analysis that measures the strength of association between two variables and the direction of the relationship. In terms of the strength of the relationship, the value of the correlation coefficient varies between +1 and -1. A value of ±1 indicates a perfect degree of association between the two variables. As the correlation coefficient moves towards 0, the relationship between the two variables becomes weaker. The direction of the relationship is indicated by the sign of the coefficient: a + sign indicates a positive relationship and a – sign indicates a negative relationship. Usually, in statistics, we measure four types of correlations: Pearson correlation, Kendall rank correlation, Spearman correlation, and the point-biserial correlation.


**Pearson *r* correlation:** Pearson *r* correlation is the most widely used correlation statistic to measure the degree of the relationship between linearly related variables. For example, in the stock market, if we want to measure how two stocks are related to each other, Pearson *r* correlation is used to measure the degree of relationship between the two. The point-biserial correlation is conducted with the Pearson correlation formula except that one of the variables is dichotomous. The following formula is used to calculate the Pearson *r* correlation:

$$r_{xy} = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$$

*r_{xy}* = Pearson r correlation coefficient between x and y

*n* = number of observations

*x_{i}* = value of x (for ith observation)

*y_{i}* = value of y (for ith observation)

**Types of research questions a Pearson correlation can examine:**

Is there a statistically significant relationship between age, as measured in years, and height, measured in inches?

Is there a relationship between temperature, measured in degrees Fahrenheit, and ice cream sales, measured by income?

Is there a relationship between job satisfaction, as measured by the JSS, and income, measured in dollars?

For the Pearson *r* correlation, both variables should be normally distributed (normally distributed variables have a bell-shaped curve). Other assumptions include linearity and homoscedasticity. Linearity assumes a straight line relationship between each of the two variables and homoscedasticity assumes that data is equally distributed about the regression line.
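As a rough illustration, the computational form of Pearson *r* (sums over the *n* paired observations) can be sketched in a few lines of Python. The function name `pearson_r` and the age and height values are made up for this example.

```python
import math

def pearson_r(x, y):
    """Pearson r from raw sums over n paired observations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data: age in years and height in inches.
age = [5, 7, 9, 11, 13]
height = [43, 48, 52, 57, 61]
print(round(pearson_r(age, height), 3))
```

The function divides by zero if either variable is constant, because a constant variable has no variance to correlate.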


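**Kendall rank correlation:** Kendall rank correlation is a non-parametric test that measures the strength of dependence between two variables. With *n* observations there are *n*(*n* − 1)/2 possible pairs of observations, and the coefficient is calculated as:

$$\tau = \frac{N_c - N_d}{\tfrac{1}{2}\,n(n-1)}$$

*N_{c}* = number of concordant pairs

*N_{d}* = number of discordant pairs

**Concordant:** Ordered in the same way.

**Discordant:** Ordered differently.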
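A minimal Python sketch of counting concordant and discordant pairs follows. It assumes there are no ties and uses the tau-a form, in which the difference between the two counts is divided by the *n*(*n* − 1)/2 possible pairs. The function name and the data are illustrative only.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # pair ordered the same way on both variables
            concordant += 1
        elif direction < 0:    # pair ordered differently
            discordant += 1
        # tied pairs count as neither (tau-a does not correct for ties)
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical ranks given by two judges to four items.
judge_a = [1, 2, 3, 4]
judge_b = [1, 3, 2, 4]
print(round(kendall_tau(judge_a, judge_b), 3))  # 0.667
```

Of the six pairs, five are concordant and one (the second and third items) is discordant, giving (5 − 1)/6.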

**Spearman rank correlation:** Spearman rank correlation is a non-parametric test that is used to measure the degree of association between two variables. The Spearman rank correlation test does not carry any assumptions about the distribution of the data and is the appropriate correlation analysis when the variables are measured on a scale that is at least ordinal.

The following formula is used to calculate the Spearman rank correlation:

$$\rho = 1 - \frac{6\sum d_i^2}{n\left(n^2 - 1\right)}$$

*ρ* = Spearman rank correlation

*d_{i}* = the difference between the ranks of corresponding variables

*n* = number of observations

**Types of research questions a Spearman Correlation can examine:**

Is there a statistically significant relationship between participants’ level of education (high school, bachelor’s, or graduate degree) and their starting salary?

Is there a statistically significant relationship between a horse’s finishing position in a race and the horse’s age?

**Assumptions**

The assumptions of the Spearman correlation are that data must be at least ordinal and the scores on one variable must be monotonically related to the other variable.
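For illustration, the rank-difference form of the Spearman coefficient can be sketched in Python as follows. The sketch assumes there are no tied values (the simple formula based on squared rank differences is only exact in that case), and the horse-race data are made up.

```python
def ranks(values):
    """Rank from 1 (smallest) to n; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """Spearman rho from squared differences between paired ranks."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical data: finishing position and age (years) of five horses.
position = [1, 2, 3, 4, 5]
age = [3, 5, 4, 7, 6]
print(round(spearman_rho(position, age), 2))  # 0.8
```

With tied values, a common alternative is to assign average ranks and compute the Pearson correlation of the ranks instead.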


**Effect size:** Cohen’s standard may be used to evaluate the correlation coefficient to determine the strength of the relationship, or the effect size. Correlation coefficients between .10 and .29 represent a small association, coefficients between .30 and .49 represent a medium association, and coefficients of .50 and above represent a large association or relationship.
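These cut-offs are easy to encode. The short Python sketch below maps a coefficient to the labels above using its absolute value. The function name and the label used for coefficients under .10 are assumptions, since the text does not name that range.

```python
def cohen_label(r):
    """Label a correlation coefficient using the cut-offs described above."""
    size = abs(r)  # strength ignores the direction of the relationship
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below small"  # assumed label; the text does not name this range

for r in (0.05, -0.25, 0.42, -0.71):
    print(r, cohen_label(r))
```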

**Ordinal data:** In an ordinal scale, the levels of a variable are ordered such that one level can be considered higher/lower than another. However, the magnitude of the difference between levels is not necessarily known. An example would be rank ordering levels of education. A graduate degree is higher than a bachelor’s degree, and a bachelor’s degree is higher than a high school diploma. However, we cannot quantify how much higher a graduate degree is compared to a bachelor’s degree. We also cannot say that the difference in education between a graduate degree and a bachelor’s degree is the same as the difference between a bachelor’s degree and a high school diploma.


## Localized Single-Voxel Magnetic Resonance Spectroscopy, Water Suppression, and Novel Approaches for Ultrashort Echo-Time Measurements

Hongxia Lei, ... Vladimír Mlynárik, in Magnetic Resonance Spectroscopy, 2014

### Slice-Selective 90° Pulses

Slice-selective RF excitation pulses must be played concurrently with a corresponding slice-selection gradient. For localized ¹H MRS, excitation pulses should produce a uniform flip angle (e.g., 90°) within the desired slice. A band-selective RF pulse is prepared by multiplying the pulse envelope by a carrier signal. The carrier is a sinusoidal waveform oscillating at precisely the desired frequency (e.g., the proton resonance frequency at a specific magnetic field strength).
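The multiplication of envelope and carrier can be sketched numerically. The Python example below is a simplified illustration under stated assumptions: a five-lobe sinc envelope, a 2 ms duration, and a carrier of 1 kHz chosen only so that the waveform can be sampled with a few hundred points (a real proton carrier lies in the MHz range). The function name and all parameter values are hypothetical.

```python
import math

def modulated_sinc_pulse(duration_s, n_samples, carrier_hz):
    """Sample a five-lobe sinc envelope multiplied by a cosine carrier."""
    pulse = []
    for i in range(n_samples):
        t = i * duration_s / (n_samples - 1)
        # u runs from -3 to +3: central lobe plus two side lobes per side
        u = 6.0 * (i / (n_samples - 1) - 0.5)
        envelope = 1.0 if abs(u) < 1e-12 else math.sin(math.pi * u) / (math.pi * u)
        carrier = math.cos(2.0 * math.pi * carrier_hz * t)
        pulse.append(envelope * carrier)
    return pulse

# Assumed values: 2 ms pulse, 201 samples, 1 kHz carrier.
pulse = modulated_sinc_pulse(0.002, 201, 1000.0)
print(len(pulse), round(pulse[100], 3))
```

The amplitudes here are normalized to a peak of 1. Scaling the waveform to reach a given flip angle is a separate calibration step that this sketch does not attempt.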

Fig. 1.2.2 illustrates three excitation pulses: gauss, sinc, and asymmetric (Tkáč et al., 1999). Note that both sinc and asymmetric pulses give better slice profiles than the gauss pulse when all pulse amplitudes are adjusted to a flip angle of 90° on resonance (Table 1.2.1). Each pulse requires its own peak amplitude (*γ*B_{1}/2π in Hz) to produce the desired flip angle.

Table 1.2.1. Comparison of Three 90° RF Pulses

RF pulse (2 ms) | 90° flip angle amplitude (γB_{1}/2π, Hz) | Bandwidth (kHz) | % of pulse length to TE |
---|---|---|---|
Gauss | 300 | 1.25 | 50 |
Sinc | 700 | 3.0 | 50 |
Asymmetric | 830 | 3.0 | 20–30 |

Here, both the asymmetric and the five-lobe sinc pulse require higher power, but provide a broader bandwidth and a more uniform excitation profile than the gauss pulse. The comparison also shows that the peak RF amplitude (B_{1}(*t*)) of the asymmetric pulse lies in the last quarter of the pulse length (Fig. 1.2.1c and Table 1.2.1). Consequently, only a small portion of the pulse length contributes to the echo time (TE); hence, TE is shorter (see the section, Basic localization ¹H MRS methods).

Unlike the sinc pulse, the slice profile of the asymmetric pulse consistently presents positive sidebands along the excitation direction. Thus, a signal outside the excitation band is not partially canceled out, and outer volume suppression (OVS) is recommended to eliminate contamination (see the section, Factors affecting spectral quality).

## Bar Graphs

As we have seen throughout this book, **bar graphs** are generally used to present and compare the mean scores for two or more groups or conditions. The bar graph in Figure 12.11 is an APA-style version of Figure 12.4. Notice that it conforms to all the guidelines listed. A new element in Figure 12.11 is the smaller vertical bars that extend both upward and downward from the top of each main bar. These are **error bars**, and they represent the variability in each group or condition. Although they sometimes extend one standard deviation in each direction, they are more likely to extend one standard error in each direction (as in Figure 12.11). The **standard error** is the standard deviation of the group divided by the square root of the sample size of the group. The standard error is used because, in general, a difference between group means that is greater than two standard errors is statistically significant. Thus one can “see” whether a difference is statistically significant based on a bar graph with error bars.

Figure 12.11 Sample APA-Style Bar Graph, With Error Bars Representing the Standard Errors, Based on Research by Ollendick and Colleagues
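To see where the length of an error bar comes from, here is a small Python sketch that computes each group's mean and standard error (the sample standard deviation divided by the square root of the group size). The scores are invented and the function name is illustrative.

```python
import math
import statistics

def mean_and_standard_error(scores):
    """Return (mean, standard error) for one group of scores."""
    sd = statistics.stdev(scores)  # sample standard deviation
    return statistics.mean(scores), sd / math.sqrt(len(scores))

# Hypothetical scores for two conditions.
group_a = [2, 4, 4, 4, 5, 5, 7, 9]
group_b = [6, 8, 8, 9, 9, 10, 11, 11]

mean_a, se_a = mean_and_standard_error(group_a)
mean_b, se_b = mean_and_standard_error(group_b)
print(mean_a, round(se_a, 3))
print(mean_b, round(se_b, 3))
print("difference between means:", mean_b - mean_a)
```

Each bar would be drawn at the group mean, with an error bar extending one standard error above and below it. Comparing the difference between the means with the standard errors gives the rough visual check described above.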

These authors contributed equally: Alex D. Washburne and James T. Morton.

### Affiliations

Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA

Department of Computer Science, University of California San Diego, La Jolla, CA, USA

James T. Morton & Rob Knight

Department of Pediatrics, University of California San Diego, La Jolla, CA, USA

James T. Morton, Jon Sanders, Daniel McDonald, Qiyun Zhu & Rob Knight

Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO, USA

Cooperative Institute for Research in Environmental Sciences, University of Colorado, Boulder, CO, USA


## Discussion

Here we studied a broad range of ecosystems to assess patterns of microbial community composition and the processes that underlie these patterns. To do so, we have characterized patterns of spatial turnover in the phylogenetic composition of microbial communities, and have inferred processes by comparing phylogenetic turnover to null model expectations. This pattern-to-process linkage requires phylogenetic signal in microbial habitat associations. Compared with previous microbial studies, we used a more statistically robust method to test phylogenetic signal. We used this updated method to provide the broadest evaluation of phylogenetic signal in microbes to date, and find significant phylogenetic signal across all evaluated habitats. Further, we used a novel statistical approach to test for deterministic processes governed by *unmeasured* environmental variables. The ability to detect deterministic influences by *unmeasured* environmental variables proved critical for understanding the relative balance between deterministic and stochastic processes.

### Phylogenetic signal varies with phylogenetic distance

Inferring ecological processes using phylogenetic information requires phylogenetic signal (Losos, 2008) in ecological niches (Cavender-Bares et al., 2009). Recent studies on freshwater *Actinobacteria* (Newton et al., 2007), marine bacterioplankton (Andersson et al., 2010) and subsurface bacteria (Stegen et al., 2012) have indicated there is a positive relationship between phylogenetic distances and ecological differences among close relatives. Our Mantel correlogram analyses support this finding: significant phylogenetic signal was consistently detected across all studied habitat types, but only across short phylogenetic distances. This is also supported by regressing habitat differences against phylogenetic distances between pairs of OTUs (following the same procedures as Stegen et al., 2012), which also showed a clear positive relationship across the short phylogenetic distances for all sample groups (Supplementary Figure S5). More quantitatively, significant phylogenetic signal was found at up to 10–30% of the maximum observed phylogenetic distance. This is consistent with Stegen et al. (2012), who found signal at up to 13–15% of the maximum phylogenetic distance for terrestrial subsurface bacteria. This general pattern in phylogenetic signal strongly indicates that closely related bacterial taxa are ecologically coherent and that interspecies gene exchange, such as horizontal gene transfer (Popa and Dagan, 2011), does not eliminate such ecological coherence at the scale of bacterial metacommunities (see also Philippot et al., 2010; Wiedenbeck and Cohan, 2011; Stegen et al., 2012).

Unexpectedly, across intermediate distances there were significant negative correlations between phylogenetic and ecological distances. This may suggest convergent evolution across intermediate phylogenetic distances, that is, that distantly related lineages acquire similar ecological niches. We are unaware of other work showing convergent evolution across free-living bacteria, but the same genes are often lost in obligate intracellular bacteria from different phyla, suggesting evolutionary convergence (Merhej et al., 2009). More generally, evolutionary convergence may have a role in common functions for complex symbiont communities across phylogenetically divergent hosts (Fan et al., 2012). However, the causes and consequences of convergent evolution in free-living microbial communities remain unclear and warrant further study.

Taken together, our results, combined with previous studies, indicate a general pattern in the phylogenetic structure of bacterial ecological niches: conserved niches/traits across short phylogenetic distances, convergent niches/traits across intermediate distances and random niches/traits across large distances. As functional and phylogenetic beta diversity for soil microbes were closely correlated (Fierer et al., 2012), and there is a phylogenetic signal for 93% of functional traits in micro-organisms (Martiny et al., 2012), it would be interesting to use other molecular markers (functional genes, for instance) to test phylogenetic signal at finer phylogenetic scales within a hierarchy of environmental factors (see Martiny et al., 2009). Nevertheless, this observation has two important implications. First, strong phylogenetic signal across short phylogenetic distances indicates that ecological processes can be inferred by studying spatial or temporal patterns in the phylogenetic structure of communities. Second, it suggests that ecological inferences are most robust when made using metrics of nearest neighbor distances (for example, betaMNTD). These metrics focus on relatively short phylogenetic distances, such that phylogenetic structure carries ecologically relevant information.
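To illustrate why a nearest-neighbor metric emphasizes short phylogenetic distances, the following Python sketch computes a presence-absence version of betaMNTD for two communities. It is an illustration only and not the code used in this study: the distance table, the taxon names and the function names are hypothetical, and abundance weighting is left out.

```python
def phylo_distance(dist, a, b):
    """Look up a symmetric distance; a taxon is at distance 0 from itself."""
    return 0.0 if a == b else dist[frozenset((a, b))]

def beta_mntd(dist, comm_a, comm_b):
    """Mean nearest-taxon distance between two communities (unweighted)."""
    a_to_b = [min(phylo_distance(dist, a, b) for b in comm_b) for a in comm_a]
    b_to_a = [min(phylo_distance(dist, b, a) for a in comm_a) for b in comm_b]
    return 0.5 * (sum(a_to_b) / len(a_to_b) + sum(b_to_a) / len(b_to_a))

# Hypothetical distances: (t1, t2) and (t3, t4) are pairs of close relatives.
dist = {
    frozenset(("t1", "t2")): 1.0, frozenset(("t3", "t4")): 1.0,
    frozenset(("t1", "t3")): 4.0, frozenset(("t1", "t4")): 4.0,
    frozenset(("t2", "t3")): 4.0, frozenset(("t2", "t4")): 4.0,
}

print(beta_mntd(dist, ["t1", "t3"], ["t2", "t4"]))  # 1.0
print(beta_mntd(dist, ["t1", "t2"], ["t3", "t4"]))  # 4.0
```

Because only the closest relative in the other community enters the calculation, deep branches affect the value only when no close relative is present.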

### Turnover rate in community composition varies among habitats

Our results showed much greater turnover in community composition between habitats than within habitats. This suggests that bacteria are specialized on particular habitats and is consistent with former meta-analyses on bacteria (for example, Lozupone and Knight, 2007; Delmont et al., 2011; Nemergut et al., 2011; Zinger et al., 2011).

Within habitats, unweighted Unifrac and betaMNTD both showed significant distance–decay patterns across six out of the nine habitat types (67%) studied here. This is consistent with Hanson et al. (2012), who found that microbial communities showed significant spatial patterns in 68% of 54 reviewed data sets. In addition, there was substantial across-habitat variation in the rate of spatial distance–decay. As expected, the turnover rate in phylogenetic community composition was highest for shallow terrestrial subsurface environments (LG and KS). This high rate of turnover in phylogenetic community composition is consistent with a previous observation of high turnover rate in taxonomic composition in a terrestrial subsurface environment (Wang et al., 2008). Such high turnover rates may be explained by strong dispersal limitation and steep environmental gradients in subsurface environments (Wang et al., 2008).

In amphibian, bird, mammal or plant assemblages, beta diversity is typically higher in mountainous regions than in regions with less topographic relief, presumably due to species specializing on particular elevations (for example, McKnight et al., 2007). Our results are consistent with this observation: turnover rate was significantly higher for the biofilm bacterial communities in mountainside streams than for other habitat types (except subsurface environments). This high elevational turnover rate of bacteria is also consistent with the results for diatoms and macroinvertebrates in the same streams (Wang et al., 2012b). On the other hand, high turnover across elevations for bacteria is somewhat different from a previous result obtained using denaturing gradient gel electrophoresis along the same elevational gradient, which showed no significant elevational distance–decay relationship (Wang et al., 2012b). This difference may have resulted from the different resolution of the two methods: a method with lower resolution may fail to detect a significant distance–decay relationship because of undetected endemism (Morlon et al., 2008; Hanson et al., 2012).

Our samples covered a wide range of horizontal spatial extents, which potentially affects the observed turnover rate. Below a spatial extent of 10 km, we did not find significant distance–decay in the KL1, KL7 or HX sample groups. At a spatial extent of <100 km, as within the habitats of Taihu Lake (ThS and ThW), we found significant differences in turnover rates across habitat types: bacterial communities in surface sediments showed a significantly higher turnover rate than their corresponding free-living communities (1.4 and 1.0 unweighted Unifrac per 10³ km, respectively). When larger spatial extents were considered (>100 km), for instance the lakes from mountain regions (SC), the sediment bacterial communities showed a significantly lower turnover rate than other habitats, especially within habitats (that is, ThS) (Figure 3).

Previous work has also found the distance–decay relationship for microbes to be scale dependent, where significant relationships occurred only across local or relatively short spatial extents (for example, King et al., 2010; Martiny et al., 2011). However, for larger spatial extents similar to those we considered here, former reports indicate that the bacterial distance–decay relationships range from significant in lake surface sediments across the Tibetan Plateau (Xiong et al., 2012) to nonsignificant in North American soils (Fierer and Jackson, 2006). These results collectively suggest that the rate of distance–decay in bacteria shows strong context-dependency, potentially driven by among-habitat variation in the degree of environmental spatial autocorrelation and in the degree of dispersal limitation.

In summary, the pairwise phylobetadiversity significantly increased with spatial distance for most of the sample groups and clearly showed that among the studied habitats there were significant differences in the rates at which community composition changes through space. In general, the horizontal turnover rates of bacterial communities from lakes or soils, or from local or regional scales, were significantly lower than the rates from subsurface environments or mountain regions. In addition, the community turnover rate in subsurface environments was highest.

### Deterministic processes govern community composition across habitats

By leveraging phylogenetic information and following the three-step procedure proposed here, we inferred the relative influences of deterministic and stochastic processes across a broad range of habitat types. The first step is to examine distributions of ses.betaMNTD; a distribution mean that deviates significantly from zero suggests a strong influence of deterministic processes (Fine and Kembel, 2011; Stegen et al., 2012). In 9 of the 11 sample groups, ses.betaMNTD distributions deviated significantly from zero (Supplementary Figure S3). Furthermore, seven groups showed distributions greater than zero (Supplementary Figure S3), suggesting that across communities there are shifts in environmental conditions that deterministically cause changes in community composition. Two groups from Taihu Lake (ThS and ThW) had mean ses.betaMNTD values that were less than zero (Supplementary Figure S3), suggesting that for both groups there was relative consistency in the environmental conditions that deterministically governed community composition. Although many features of the observed abiotic environment in Taihu Lake varied across sampled communities, a high level of eutrophication was maintained across communities (Duan et al., 2009). It may therefore be that high eutrophication imposed strong environmental filtering on microbial communities in Taihu Lake. More generally, our observation of ses.betaMNTD values ranging from negative to null to positive highlights the fact that the influence of deterministic ecological processes varies across systems: deterministic processes can minimize spatial variation in, have little influence over, or drive large shifts in community composition. A major challenge for future work is to mechanistically understand variation in the influence of deterministic processes.

The second step in the process-inference procedure focuses on revealing which process is the primary cause of significant, partial Mantel coefficients relating turnover in community composition to spatial distance. A common interpretation is that such a relationship is caused by stochastic processes. Although intuitive, this interpretation may be premature, especially if turnover in community composition is quantified using an observed or ‘raw’ metric; a raw metric such as betaMNTD simply measures the difference in composition between two communities.
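As a sketch of how a standardized effect size such as ses.betaMNTD is obtained, the Python example below compares an observed value with a null distribution built by shuffling taxon labels across the tips of the phylogeny, then expresses the departure in units of the null standard deviation. It is an illustration under stated assumptions and not the analysis code of this study: the presence-absence `beta_mntd`, the toy distance table, the number of randomizations and the seed are all hypothetical.

```python
import random
import statistics

def phylo_distance(dist, a, b):
    return 0.0 if a == b else dist[frozenset((a, b))]

def beta_mntd(dist, comm_a, comm_b):
    """Unweighted mean nearest-taxon distance between two communities."""
    a_to_b = [min(phylo_distance(dist, a, b) for b in comm_b) for a in comm_a]
    b_to_a = [min(phylo_distance(dist, b, a) for a in comm_a) for b in comm_b]
    return 0.5 * (sum(a_to_b) / len(a_to_b) + sum(b_to_a) / len(b_to_a))

def ses_beta_mntd(dist, taxa, comm_a, comm_b, n_null=999, seed=0):
    """(observed - mean of null) / sd of null, with a tip-shuffling null."""
    rng = random.Random(seed)
    observed = beta_mntd(dist, comm_a, comm_b)
    null_values = []
    for _ in range(n_null):
        shuffled = list(taxa)
        rng.shuffle(shuffled)              # randomize taxon labels on the tips
        relabel = dict(zip(taxa, shuffled))
        null_values.append(beta_mntd(dist,
                                     [relabel[t] for t in comm_a],
                                     [relabel[t] for t in comm_b]))
    sd = statistics.stdev(null_values)
    if sd == 0:
        return float("nan")                # null has no spread; undefined
    return (observed - statistics.mean(null_values)) / sd

# Hypothetical distances: (t1, t2) and (t3, t4) are pairs of close relatives.
taxa = ["t1", "t2", "t3", "t4"]
dist = {
    frozenset(("t1", "t2")): 1.0, frozenset(("t3", "t4")): 1.0,
    frozenset(("t1", "t3")): 4.0, frozenset(("t1", "t4")): 4.0,
    frozenset(("t2", "t3")): 4.0, frozenset(("t2", "t4")): 4.0,
}

high = ses_beta_mntd(dist, taxa, ["t1", "t2"], ["t3", "t4"])
low = ses_beta_mntd(dist, taxa, ["t1", "t3"], ["t2", "t4"])
print(high > 0, low < 0)
```

A positive value means the two communities are more phylogenetically distinct than expected under the null model, and a negative value means they are more similar than expected.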

Consider a scenario in which the turnover in community composition is quantified using betaMNTD and in which there is an *unmeasured* environmental variable that changes across sampled microbial communities. If this *unmeasured* variable governs community composition, it can cause a significant partial Mantel coefficient relating betaMNTD to spatial distances. The standard (and incorrect) inference would be that community composition is governed by stochastic processes.

To make a more robust inference we use ses.betaMNTD as the turnover metric. In this case, the partial Mantel coefficient related to spatial distance should reflect deterministic processes governed by *unmeasured* environmental variables. The reason is twofold: (i) the influence of measured environmental variables has already been accounted for because we are dealing with partial Mantel coefficients and (ii) the magnitude of phylogenetic null model departures (that is, ses.betaMNTD) should only be influenced by deterministic processes; stochastic processes should have no influence (Hardy, 2008).

When using ses.betaMNTD as the turnover metric, a significant, partial Mantel coefficient related to spatial distances should indicate that stochastic processes are overwhelmed by deterministic processes governed by *unmeasured* environmental variables. Similarly, if the partial Mantel coefficient is nonsignificant, it would indicate that *unmeasured* environmental variables have little influence over community composition.
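For readers unfamiliar with the statistic, the following Python sketch shows one common way to compute a partial Mantel coefficient: the correlation between the pairwise turnover values and the spatial distances after the correlation of each with the environmental distances has been removed. It is a simplified illustration and not the code used here. The matrices are invented, the function names are hypothetical, and the permutation step normally used to assess significance is omitted.

```python
import math

def upper_triangle(matrix):
    """Flatten the upper triangle of a square symmetric distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_mantel_statistic(turnover, space, env):
    """Correlation of turnover with space, controlling for env."""
    t, s, e = (upper_triangle(m) for m in (turnover, space, env))
    r_ts, r_te, r_se = pearson(t, s), pearson(t, e), pearson(s, e)
    return (r_ts - r_te * r_se) / math.sqrt((1 - r_te ** 2) * (1 - r_se ** 2))

# Hypothetical pairwise matrices for four sites.
space = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
env = [[0, 2, 1, 3], [2, 0, 1, 2], [1, 1, 0, 3], [3, 2, 3, 0]]
turnover = [[0.0, 0.2, 0.5, 0.9], [0.2, 0.0, 0.3, 0.6],
            [0.5, 0.3, 0.0, 0.4], [0.9, 0.6, 0.4, 0.0]]

print(round(partial_mantel_statistic(turnover, space, env), 3))
```

With ses.betaMNTD as the turnover metric, `turnover` would hold the pairwise ses.betaMNTD values between sites.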

In ThS, KS and STR groups, spatial distances were significantly related to ses.betaMNTD after controlling for measured environmental distances (Table 2). We therefore infer that in these habitats there are *unmeasured*, spatially structured environmental variables that influence community composition by imposing deterministic processes. We also infer that deterministic processes imposed by *unmeasured* variables overwhelm any influences of stochastic processes. In contrast, for the other six groups, *unmeasured* environmental variables have little influence. For these six groups, a significant influence of stochastic processes can be indicated by a significant relationship between spatial distance and Unifrac or betaMNTD (after controlling for measured environmental variables).

The third step in our process-inference procedure evaluates the relative balance between deterministic and stochastic processes. To begin, we infer a greater influence of deterministic processes for the three groups characterized by significant influences of measured and *unmeasured* environmental variables (ThS, KS and STR). This inference assumes that if stochastic processes were more influential than deterministic processes, spatial distances alone would not be related to ses.betaMNTD; stochastic processes acting alone should cause there to be no relationship between ses.betaMNTD and spatial distance.

For the six sample groups in which ses.betaMNTD was not related to spatial distances (by partial Mantel), we use the relative magnitudes of partial Mantel coefficients from the analyses of Unifrac and betaMNTD. The reason we use these observed or ‘raw’ beta diversity metrics instead of ses.betaMNTD is that they can increase with an increased influence of stochastic processes. That is, increased stochasticity should increase taxonomic turnover and increased taxonomic turnover should, by itself, cause increases in phylogenetic turnover. For five of the six sample groups, the partial Mantel coefficient was larger for environmental distance than for spatial distance (Supplementary Tables S1 and S2). We take this as evidence that deterministic processes are more influential than stochastic processes in these five groups. The exception was the HX group, which was characterized by nonsignificant partial Mantel tests that do not support any clear ecological inference.

It is worth noting that some inferences drawn here are opposite to those which we would have made using ‘traditional’ approaches without null models. When a higher partial Mantel coefficient is observed for spatial distance than for environmental distance, the standard inference is that stochastic processes have a stronger influence than deterministic processes. Using this approach (for example, for KS and STR), partial Mantel coefficients based on betaMNTD would suggest a stronger influence of stochastic processes (Supplementary Table S2).

The fact that ses.betaMNTD is significantly related to spatial distance in KS and STR, however, implies that the larger coefficient on spatial distance is actually driven by *unmeasured* environmental variables that deterministically govern community composition. This reverses the inference from the dominance of stochastic processes to the dominance of deterministic processes. We suggest that the approach used here provides more informed inferences than the standard Mantel framework. It would be informative to apply the approach to previously studied microbial and macro-organism systems; substantial changes in our understanding of these systems may result. We stress, however, that both approaches are important and the relative utility of each will depend on the context and questions of interest.

## Discussion

The present study aimed to relate phylogenetic alpha and beta dispersion, and functional alpha and beta dispersion across a series of forest plots in order to infer mechanisms of community assembly in six forest dynamics plots. Specifically, we quantified the phylogenetic and trait dissimilarity of individuals within forest subplots and compared that value to the phylogenetic and trait dissimilarity of all individuals between subplots using the framework presented in Fig. 1. This was done using pairwise metrics of alpha and beta dispersion, as well as nearest-neighbor metrics of alpha and beta dispersion.

### Phylogenetic and functional alpha and beta dispersion: pairwise metrics

The pairwise values were highly correlated: high phylogenetic turnover between subplots was associated with high dispersion within subplots, and low phylogenetic turnover with low within-plot dispersion. This axis could be envisioned in terms of a stress gradient assembly mechanism where low local dispersion and low turnover occur in relatively harsh and spatially contiguous habitats, and high dispersion and high turnover occur in more benign and potentially patchy habitats (Helmus and Ives 2012). The relative proportion of subplots falling on either end of this spectrum was generally equivalent. The exception to this was the BCI forest plot, where more subplots were phylogenetically overdispersed with high phylogenetic dissimilarity between a subplot and its neighboring subplots. We should note that the BCI results may, in some cases, seem divergent from those previously reported from this forest (e.g., Kress et al. 2009), but we remind the reader that the present manuscript weighted all analyses by abundance, whereas previous work used presence–absence weighting. This could be taken as evidence that negative biotic interactions and among-subplot habitat heterogeneity are important for understanding the phylogenetic diversity at the scales studied in the BCI forest plot. We do caution that the present work did not directly measure abiotic filtering using environmental data. This is a weakness of the approach and could not be strengthened due to a lack of consistent and meaningful environmental data sets from all plots studied. Ideally, the inferences made here and in the rest of the discussion will be more strongly substantiated in the future when consistent and informative environmental data are available for these forests and others.

The pairwise trait metrics were similarly correlated with many traits being underdispersed locally in most plots indicating nonrandom processes structuring local communities in these forests. This result is similar to previous work in tropical forests that found strongly deterministic trait dispersion (Swenson and Enquist 2007, Kraft et al. 2008, Swenson and Enquist 2009). For example, maximum height, specific leaf area, and wood density were often clustered in local communities suggesting that abiotic filtering may increase the similarity of traits in these communities. The beta dispersion results showed a large number of subplots having little functional differentiation from one subplot to the next. For example, for the majority of traits, except seed mass, the BCI forest subplots had lower than expected trait turnover between subplots, suggesting that, although species turnover from subplot to subplot occurs, there is relatively little functional turnover. Such a pattern could result from functionally deterministic community assembly with dispersal limitation. This would be particularly expected given the relatively homogeneous topography in the BCI forest plot. A similar pattern was also uncovered in the Wabikon Lake forest plot in Wisconsin, USA, where most traits, except leaf area, had lower than expected trait turnover among subplots. Thus, the BCI result cannot be explained as a tropical phenomenon. That said, it is important to recognize that these are null modeling results and that the raw turnover may be quite high in the tropics, but not higher than that expected given the observed elevated patterns of species beta diversity and the trait pool (see Kraft et al. 2011).
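To make "lower than expected" concrete, the following is a minimal sketch of how an observed dispersion value can be compared with a null distribution through a standardized effect size (SES). It is an illustration only: the null model used here (drawing the same number of species at random from the trait pool), the function names, and the trait values are all assumptions, not the procedure or data of the study.

```python
import random
import statistics
from itertools import combinations


def mean_pairwise_distance(traits):
    """Mean absolute trait difference over all pairs of species."""
    pairs = list(combinations(traits, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def standardized_effect_size(observed_traits, pool_traits, n_null=999, seed=1):
    """SES = (observed - mean of null) / sd of null.

    Null model (an assumption for this sketch): draw the same number of
    species at random, without replacement, from the trait pool.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(observed_traits)
    null_values = [
        mean_pairwise_distance(rng.sample(pool_traits, len(observed_traits)))
        for _ in range(n_null)
    ]
    return (observed - statistics.mean(null_values)) / statistics.stdev(null_values)


# Hypothetical trait pool: 20 species with trait values 0..19.
pool = list(range(20))
# Hypothetical subplot containing four species with similar trait values.
subplot = [0, 1, 2, 3]

ses = standardized_effect_size(subplot, pool)
print(f"SES = {ses:.2f}")  # negative: less trait dispersion than the null expects
```

A negative SES corresponds to underdispersion (clustering) relative to the null, and a positive SES to overdispersion. The raw observed value can still be large in absolute terms, which is the distinction drawn in the paragraph above.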

It is important to note that the phylogenetic results showed local overdispersion and higher than expected phylogenetic turnover, while the majority of the trait results were the opposite. This was particularly true for the BCI forest plot and to a lesser extent the temperate plots. This suggests that there is likely substantial trait convergence between the species in the BCI forest plot community in particular, which is substantiated by the phylogenetic signal analyses we performed (Table 2). Biologically, this suggests that there is strong abiotic filtering of traits within and across subplots of this spatial scale in the BCI forest, but there is a substantial turnover of lineages from subplot to subplot that generally are functional replacements of one another. Thus, for the BCI forest, there is trait convergence, dispersal limitation of lineages, and deterministic abiotic filtering of most traits. In the other plots, there also appears to be similar trait convergence, some dispersal limitation of lineages, and again a deterministic abiotic filtering of most traits. As previously noted, this study did not analyze any defense traits of the species in these plots. Previous work has shown there to be varying degrees of phylogenetic signal in plant defense (e.g., Becerra 1997, 2007, Gilbert and Webb 2007, Kursar et al. 2009, Lamarre et al. 2012), and the strength of the signal may depend on the phylogenetic breadth of the taxa being studied; thus, it is not entirely clear whether results from defense trait analyses would mirror our phylogenetic results.

Aside from the biological implications of the mismatch between phylogenetic and trait results is the practical implication that measures of phylogenetic alpha and beta diversity or dispersion are not always strong predictors of functional patterns. In other words, studies of phylogenetic alpha and beta diversity alone may be poor predictors of the actual functional alpha and beta diversity (Swenson 2011*a*, Swenson et al. 2012). Thus, as many others have stressed, phylogenetic relatedness is not always a good predictor of species similarity, and assembly studies that only use phylogenetic information may be misleading.

### Phylogenetic and functional alpha and beta dispersion: nearest-neighbor metrics

The nearest-neighbor alpha and beta dispersion metrics were generally uncorrelated with one another using both phylogenetic and trait information. In all forest plots except BCI, the phylogenetic nearest-neighbor turnover was lower than that expected given the null model. This result largely contrasts with the results of the pairwise metric. This is due to large shifts in the abundance distribution from subplot to subplot, driving a large pairwise dissimilarity between subplots, but not a large nearest-neighbor turnover. In other words, species A could have 50 individuals and species B could have 4 individuals in subplot 1, while in subplot 2, they have 2 and 75 individuals, respectively. Such a pattern would result in large pairwise dissimilarity, but no nearest-neighbor dissimilarity. The phylogenetic nearest-neighbor alpha dispersion results ranged from strongly overdispersed to strongly clustered depending on the forest plot, and there was no relationship with latitude. Thus, the one general finding was that nearest-neighbor turnover was typically lower than expected from subplot to subplot in all forests except BCI.
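The two-species example above can be worked through numerically. The sketch below uses simplified abundance-weighted definitions of pairwise and nearest-neighbor dissimilarity, which are assumptions made for illustration and not necessarily the exact formulas used in the study. The distance function is also hypothetical: 0 within a species and 1 between species A and B.

```python
def relative_abundances(counts):
    """Convert individual counts per species into relative abundances."""
    total = sum(counts.values())
    return {sp: n / total for sp, n in counts.items()}


def pairwise_dissimilarity(c1, c2, dist):
    """Abundance-weighted mean distance between individuals of two subplots."""
    f1, f2 = relative_abundances(c1), relative_abundances(c2)
    return sum(f1[i] * f2[j] * dist(i, j) for i in f1 for j in f2)


def nearest_neighbor_dissimilarity(c1, c2, dist):
    """Abundance-weighted mean distance to the closest species in the other
    subplot, averaged over both directions."""
    f1, f2 = relative_abundances(c1), relative_abundances(c2)
    d12 = sum(f1[i] * min(dist(i, j) for j in f2) for i in f1)
    d21 = sum(f2[j] * min(dist(i, j) for i in f1) for j in f2)
    return (d12 + d21) / 2


def dist(a, b):
    # Hypothetical phylogenetic distance: 0 within a species, 1 between A and B.
    return 0.0 if a == b else 1.0


# Abundances taken from the example in the text.
subplot_1 = {"A": 50, "B": 4}
subplot_2 = {"A": 2, "B": 75}

d_pw = pairwise_dissimilarity(subplot_1, subplot_2, dist)
d_nn = nearest_neighbor_dissimilarity(subplot_1, subplot_2, dist)
d_pw_same = pairwise_dissimilarity(subplot_1, subplot_1, dist)

print(f"pairwise, subplot 1 vs 2:         {d_pw:.3f}")       # 0.904
print(f"pairwise, subplot 1 vs itself:    {d_pw_same:.3f}")  # 0.137
print(f"nearest-neighbor, subplot 1 vs 2: {d_nn:.3f}")       # 0.000
```

Because both species occur in both subplots, every individual has a conspecific nearest neighbor in the other subplot, so the nearest-neighbor value is zero. The pairwise value is large only because the dominant species switches from A to B.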

The nearest-neighbor alpha dispersion was lower than expected for many traits, while maximum height, leaf area, seed mass, and wood density were occasionally more diverse than expected in many of the plots (Fig. 5). This indicates abiotic filtering of some traits in some forests, and a role for biotic interactions with respect to other traits. In other words, there were no clearly defined patterns that emerged from the nearest-neighbor analyses of trait alpha dispersion. The beta trait dispersion was also inconsistent across traits and plots, making general inferences difficult. It appears that most nearest-neighbor trait dispersion results largely hovered around zero, or randomness. Thus, while local trait dispersion was constrained within a subplot, patterns of nearest-neighbor similarity between subplots cannot be easily explained and random turnover cannot be rejected.

Chloroplasts are cytoplasmic organelles and the sites of photosynthesis in eukaryotic cells. Advances in structural biology and comparative genomics allow us to identify individual components of the photosynthetic apparatus precisely with respect to the subcellular location of their genes. Here we present outline maps of four energy-transducing thylakoid membranes. The maps for land plants and red and green algae distinguish protein subunits encoded in the nucleus from those encoded in the chloroplast. We find no defining structural feature that is common to all chloroplast gene products. Instead, conserved patterns of gene location are consistent with photosynthetic redox chemistry exerting gene regulatory control over its own rate-limiting steps. Chloroplast DNA carries genes whose expression is placed under this control.


## Example & Setup in SPSS Statistics

The Director of Research of a small university wants to assess whether the experience of an academic and the time they have available to carry out research influence the number of publications they produce. Therefore, a random sample of 21 academics from the university is asked to take part in the research: 10 are experienced academics and 11 are recent academics. The number of hours they spent on research in the last 12 months and the number of peer-reviewed publications they generated are recorded.

To set up this study design in SPSS Statistics, we created three variables: (1) no_of_publications, which is the number of publications the academic published in peer-reviewed journals in the last 12 months; (2) experience_of_academic, which reflects whether the academic is experienced (i.e., has worked in academia for 10 years or more, and is therefore classified as an "Experienced academic") or has recently become an academic (i.e., has worked in academia for less than 3 years, but at least one year, and is therefore classified as a "Recent academic"); and (3) no_of_weekly_hours, which is the number of hours an academic has available each week to work on research.


## 2 Answers

The general rule of thumb (based on stuff in Frank Harrell's book, *Regression Modeling Strategies*) is that *if you expect to be able to detect reasonable-size effects with reasonable power*, you need 10-20 observations per parameter (covariate) estimated. Harrell discusses a lot of options for "dimension reduction" (getting your number of covariates down to a more reasonable size), such as PCA, but the most important thing is that in order to have any confidence in the results **dimension reduction must be done without looking at the response variable**. Doing the regression again with just the significant variables, as you suggest above, is in almost every case a bad idea.
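As a rough illustration of that rule of thumb, the sketch below turns a sample size into the range of parameters it could support at 10 to 20 observations per parameter. The function name and the sample size of 60 are hypothetical, and the result is only a guide, not a substitute for a proper power analysis.

```python
def supportable_parameters(n_observations, per_parameter=(10, 20)):
    """Return (lenient, conservative) numbers of parameters that
    n_observations can support under an observations-per-parameter rule."""
    lenient_ratio, conservative_ratio = per_parameter
    return n_observations // lenient_ratio, n_observations // conservative_ratio


n = 60  # hypothetical sample size
lenient, conservative = supportable_parameters(n)
print(f"With {n} observations: about {conservative} to {lenient} parameters")
```

If the number of covariates of interest exceeds this range, that is the situation in which dimension reduction done without reference to the response becomes relevant.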

However, since you're stuck with a data set and a set of covariates you're interested in, I don't think that running the multiple regression this way is inherently wrong. I think the best thing would be to accept the results as they are, from the full model (don't forget to look at the point estimates and confidence intervals to see whether the significant effects are estimated to be "large" in some real-world sense, and whether the non-significant effects are actually estimated to be smaller than the significant effects or not).

As to whether it makes any sense to do an analysis without the predictor that your field considers important: I don't know. It depends what kind of inferences you want to make based on the model. In the narrow sense, the regression model is still well-defined ("what are the marginal effects of these predictors on this response?"), but someone in your field might quite rightly say that the analysis just doesn't make sense. It would help a little bit if you knew that the predictors you have are uncorrelated with the well-known predictor (whatever it is), or that the well-known predictor is constant or nearly constant for your data: then at least you could say that something other than the well-known predictor does have an effect on the response.