As in conventional phylogenetic analyses, issues surrounding the source data are paramount in supertree construction, but have received insufficient attention. In supertree construction, however, the source data are phylogenetic trees rather than primary character data, which raises several supertree-specific problems. In this paper, we examine several key data issues for supertree construction, including data set non-independence, taxonomy of terminal taxa, and the question of what constitutes a valid source tree. Throughout, we present our suggested protocol for source tree collection and manipulation, based on our experience in building a supertree of mammals. Other protocols and decisions are naturally possible. What is important is that all collection protocols are presented explicitly and address, at a minimum, the issues that we have identified.