Main procedures of cDNA technology
cDNA chip technology makes it possible to collect gene expression data at successive time points, which lays a solid foundation for revealing the function of the genes involved. In addition, the quantity of mRNA broadly reflects the protein level of the cell at that moment, which facilitates an effective and systematic assessment of cellular protein content.
Because of these merits, the cDNA chip is widely used in research on cell metabolism, disease mechanisms, drug screening, and mechanisms of drug action. In disease diagnosis, it is mainly used for the classification of cancer. For example, Alizadeh used a cDNA chip with nearly 15,000 genes to classify diffuse large B-cell lymphoma. Okabe used a cDNA chip with 23,040 genes to compare primary liver cancer tissue with normal tissue and identified 165 upregulated and 170 downregulated genes. In addition, 19 differentially expressed genes were used to distinguish HBV-related from HCV-related hepatocellular carcinoma.
With the rapid growth of proteomics, transcriptomics and proteomics have been combined into a new approach to finding molecular markers. For example, Seliger used cDNA chips together with proteomics to identify annexin A4, tubulin α-1A chain, and ubiquitin carboxyl-terminal hydrolase L1 as candidate molecular markers of renal cell carcinoma.
SNP Chip Technology
SNP (single-nucleotide polymorphism): Traditionally, SNPs are examined by gel electrophoresis. SNP chip technology uses in situ synthesis of oligonucleotides or microprinting to fix large numbers of DNA fragments in an ordered pattern on the surface of a solid support, forming a probe array that is hybridized with the labeled sample. The hybridization signals are then read to obtain polymorphism information quickly, efficiently, and in parallel. Chip platforms include microarrays, fiber-membrane microdot arrays, and glass-slide arrays, with densities ranging from hundreds to millions of probes.
The TaqMan probe chip, one of the methods for quantitative DNA detection, combines real-time detection with chip technology. It brings together routine quantification, sensitivity, and real-time readout, and it also allows parallel, high-throughput analysis. However, enzymatic cleavage of the TaqMan probe proceeds in step with PCR amplification, which can increase nonspecific signals. Whether the technique is suitable for low- and medium-density SNP chips therefore requires further study.
Single-base extension (SBE) is mainly used for developing low-density chips and for accurate genotyping of selected SNP sites [28, 29]. In a variant of the technique, the extension labeling reaction is completed and the allele type is determined from the degree of matching. Compared with SNPstream, the SBE primer design requires only one fluorescent label, so the types of SNP variation that can be examined are not restricted.
Low-throughput chips based on ligation-rolling circle amplification (L-RCA): a padlock probe is circularized by T4 ligase or a thermostable ligase, which discriminates point mutations in the DNA sequence with high sensitivity. After rolling circle amplification and enzyme digestion, the circular probe sequences are hybridized for identification. The design of L-RCA and the complexity of the gene sequence limit throughput, so the method cannot meet the needs of high-throughput genotyping chips.
High-throughput chips with GoldenGate™: GoldenGate™ offers the highest throughput currently available for SNP chip preparation, up to one million, while requiring only 250 ng of genomic DNA. This level of multiplexing largely meets the needs of genotyping, and genotyping accuracy can reach 96.64 %. However, the technique is suitable only for SNPs with two allelic states, and only 60 % of SNPs can be detected [30, 31]. For whole-genome scans it detects tag SNPs, and it carries more genetic information than the GeneChip Human Mapping 100K Array.
With the rapid progress of SNP chip technology toward massive parallelism, high throughput, miniaturization, and automation, it can be used to discover new SNP sites and to pinpoint SNP positions in the genome. Large-scale SNP genotyping must be supported by accurate and reliable detection methods. SNP chip technology is expected to become an important tool for molecular diagnosis, clinical testing, clinical treatment, and new drug development. The development of SNP chips and the associated research on gene polymorphism have improved individualized medical testing and provided a diagnostic basis for individualized medication, which in turn promotes the growth of the niche diagnostics market.
DNA Methylation Test Method
DNA Methylation Chip
As one of the most common epigenetic changes, DNA methylation is crucial to normal cell development and tissue stability. Methylation of CpG sites in the promoter or first-exon region has been shown to silence gene expression.
Methods for detecting methylation markers fall into two main types: detection after chemical modification and detection after digestion with methylation-sensitive restriction enzymes. The first was established by Frommer. Building on it, Gitan developed the methylation-specific oligonucleotide (MSO) microarray. For MSO, a pair of probes containing GC or AC is designed to recognize the methylated and unmethylated sequences, respectively, and the probes are fixed on the support. The target fragment is treated with bisulfite, which converts unmethylated cytosine to uracil while leaving methylated cytosine unchanged, and is then amplified by PCR. The 3′-end of the product is labeled with fluorescein and hybridized with the probes. The fluorescence intensity after hybridization indicates the level of methylation in the tested sequence. This is one of the commonly used DNA methylation chips, and it has been applied to detailed studies of the promoter regions of genes such as estrogen receptor (ER), p16, and adenomatous polyposis coli (APC) [36, 37]. However, MSO cannot provide data for each individual CpG site, and cross-hybridization between probes may occur, so the false-positive rate is high and controls must be included.
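The bisulfite step described above can be illustrated with a short sketch. The function names, the toy sequence, and the set of methylated positions are all hypothetical inputs chosen for illustration; the sketch only shows why methylated and unmethylated cytosines become distinguishable after treatment and PCR.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence after bisulfite treatment.

    Unmethylated cytosines are converted to uracil (U); methylated
    cytosines are left as C. `methylated_positions` is a set of
    0-based indices (a hypothetical input for this sketch).
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def after_pcr(converted):
    """PCR copies uracil as thymine, so U is read as T in the product."""
    return converted.replace("U", "T")


if __name__ == "__main__":
    toy = "ACGTCGA"                      # two CpG sites, at indices 1 and 4
    treated = bisulfite_convert(toy, methylated_positions={1})
    print(treated)                       # the methylated C at index 1 is kept
    print(after_pcr(treated))            # the unmethylated C is now read as T
```

A probe that expects CG at a site therefore matches only the methylated template, while a probe that expects TG on the same strand matches only the unmethylated one.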
Detection after digestion with methylation-sensitive restriction enzymes includes differential methylation hybridization (DMH), methylation-sensitive arbitrarily primed PCR (MS-AP-PCR), methylated CpG island amplification (MCA), and restriction landmark genomic scanning (RLGS). DMH is used to profile methylation across the whole genome and to compare methylation patterns between cancer tissue and normal tissue. It resembles mRNA expression profiling on cDNA arrays and uses a CpG island array. Cross et al. created an affinity matrix carrying a methyl-CpG binding domain to isolate CpG island sequences from human genomic DNA. The enzyme Mse I recognizes TTAA sites and cuts genomic DNA into fragments smaller than 200 bp while leaving CpG-rich regions uncut. After linkers are ligated to both ends of the fragments, they are digested with methylation-sensitive endonucleases such as BstU I, Hpa II, and Hha I. Methylated fragments are protected from cutting and can be amplified by linker PCR, whereas unmethylated fragments are cut and cannot be amplified. The subsequent fluorescence labeling, hybridization, imaging, and data processing are the same as for expression profiling chips. The method is simple and effective and can be used to discriminate tumors. DMH has been used successfully to examine methylation profiles in ovarian cancer. However, its wider adoption is limited by the restricted distribution of enzyme sites and the specialized instruments required.
Different methylation profiles reflect different stages or types of tumor. Hypermethylation of CpG islands is associated with tumor development and can therefore serve as a distinctive marker for particular tumor subtypes. It is widely used in the examination of tumors such as non-Hodgkin’s lymphoma, lung cancer, and ovarian cancer. High-throughput methylation screening can also detect abnormal gene expression patterns in diseases such as malignant tumors. It thus helps clarify the mechanism of tumor formation and supports monitoring and prediction of the response to demethylating drugs in chemotherapy.
Restriction Landmark Genome Scanning (RLGS)
RLGS is a high-throughput technique for DNA methylation analysis that combines methylation-sensitive restriction endonucleases (MS-RE) with two-dimensional gel electrophoresis. It relies on the different sensitivity of restriction enzymes to methylated and unmethylated CpG sites. No prior sequence information is required, so the method is suitable for examining thousands of CpG islands across the whole genome and for finding genes with newly methylated CpG islands.
Methylation-sensitive restriction enzymes reveal the methylation status across the genome, and enzymes with GC-rich recognition sites yield many landmarks in CpG islands near promoters. The shortcoming is that only positions containing the recognition site of the enzyme can be examined, and incomplete digestion can produce false positives. Because inactivation of tumor suppressor genes through methylation of promoter CpG islands is closely related to tumor development, it is important to characterize methylation changes in the CpG islands of tumor suppressor genes.
Procedures of RLGS [41–44]: Sample DNA is digested with the landmark enzyme, which is the crucial step in RLGS. The methylation-sensitive enzymes Not I (GC↓GGCCGC) or Asc I (GG↓CGCGCC) are generally used because they cut infrequently and their recognition sites contain at least two CpG sites. After cutting, the ends are labeled with 32P-dCTP and 32P-dGTP. The DNA is then digested with EcoRV (methylation-insensitive, with a higher cutting frequency than Not I) to produce shorter Not I-EcoRV fragments, which are separated by first-dimension electrophoresis. The fragments are then digested in the gel with Mbo I to produce still shorter Not I-Mbo I fragments, which are separated by second-dimension electrophoresis to give the RLGS map. Virtual RLGS analysis: for organisms with a fully sequenced genome, the RLGS pattern can be simulated independently and compared with the real RLGS map to identify the spots (for details, see Ref. ). In the comparison, spots that are lost or weakened in the sample indicate hypermethylated CpG islands (provided that DNA loss is excluded), whereas newly appearing or enhanced spots indicate hypomethylated CpG islands (provided that DNA amplification is excluded).
Potential applications include (1) polymorphism screening and gene mapping, (2) marker gene research, (3) research on gene structure and methylation in cancer tissue and cloned mice, and (4) variety identification of crops.
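The idea behind virtual RLGS can be sketched as an in silico double digest. This is a minimal illustration under stated assumptions, not the published procedure: it considers only the top strand, only the Not I and EcoRV steps, and a toy sequence; the function names and the `methylated_cuts` parameter are hypothetical.

```python
# Recognition sites and top-strand cut offsets (Not I: GC^GGCCGC, EcoRV: GAT^ATC).
ENZYMES = {
    "NotI": ("GCGGCCGC", 2),
    "EcoRV": ("GATATC", 3),
}


def cut_positions(seq, enzyme):
    """Return the 0-based cut positions of one enzyme on the top strand."""
    site, offset = ENZYMES[enzyme]
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def landmark_fragments(seq, methylated_cuts=()):
    """Fragments of a Not I + EcoRV digest that carry a labeled Not I end.

    `methylated_cuts` lists Not I cut positions blocked by methylation
    (a hypothetical input for this sketch). Returns (start, end, length).
    """
    noti = set(cut_positions(seq, "NotI")) - set(methylated_cuts)
    cuts = sorted(noti | set(cut_positions(seq, "EcoRV")))
    bounds = [0] + cuts + [len(seq)]
    return [
        (left, right, right - left)
        for left, right in zip(bounds, bounds[1:])
        if left in noti or right in noti
    ]


if __name__ == "__main__":
    toy = "AAAA" + "GATATC" + "TTTT" + "GCGGCCGC" + "AAAAAA" + "GATATC" + "CC"
    print(landmark_fragments(toy))                        # unmethylated Not I site
    print(landmark_fragments(toy, methylated_cuts=[16]))  # methylated site: no spots
```

When the Not I site is marked as methylated, no labeled fragment is produced, which corresponds to a lost spot on the RLGS map.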
Peripheral Blood DNA as Well as Related Microsatellite DNA and Tumor-Specific DNA (RNA)
Peripheral blood DNA, also called free DNA, plasma DNA, or serum DNA, consists of double-stranded DNA, single-stranded DNA, or a mixture of the two. It exists as free DNA and as DNA-protein complexes. The level of circulating DNA in healthy individuals is very low, around 3.6–5.0 ng/ml, and probably derives from cell death. Most of the DNA fragments are shorter than 180 bp. Cell culture studies show that fragments of this size are found in the culture medium of apoptotic cells, whereas most fragments in the medium of necrotic cells are longer than 10,000 bp. In tumor patients, the content of peripheral blood DNA is usually increased and its composition is complex, which is related to the pathophysiological state. The increase may be caused by the death of tumor cells, the lysis of circulating tumor cells or metastatic foci, or the release of DNA by tumor cells into the peripheral circulation, in a manner similar to the shedding of surface proteins [46–48] (Fig. 5.2).
Common measurements of peripheral blood DNA include its total content; the type and content of tumor metastasis-related microsatellite DNA and tumor-specific DNA (RNA); and their percentage of total DNA. Changes in these measures often occur earlier than changes in tumor markers such as AFP and CEA, so they are important for early tumor diagnosis, detection of postoperative metastasis, and assessment of treatment response. Gabriella et al. tested plasma samples from 43 healthy individuals and 84 patients with non-small-cell lung cancer and found that the plasma DNA concentration in the control group was 18 ng/ml, while that of stage Ia and Ib lung cancer patients reached 320 ng/ml and 344 ng/ml, respectively. In a follow-up of 38 lung cancer patients after pneumonectomy, the mean plasma DNA concentration of the 35 patients without recurrence was 34 ng/ml; of the three patients whose plasma DNA rose 2- to 20-fold after the operation, two died from hepatic metastasis and one had a local recurrence after 2 years. Oliver et al. used real-time fluorescent quantitative PCR to test plasma DNA from 46 healthy individuals and 185 patients with non-small-cell lung cancer before and after chemotherapy, and found that the circulating DNA concentration decreased relative to pretreatment levels in patients with stable disease but increased in patients whose disease progressed. Chao et al. dynamically monitored plasma DNA in cancer patients after chemotherapy and found that it rose transiently within 2 weeks of the start of chemotherapy and then declined steadily.
Separation of Complex Proteomes and Detection of Low-Abundance Proteins
Two-Dimensional Electrophoresis (2DE)-Based Protein Expression Spectrum
Protein separation in 2DE consists of isoelectric focusing, which separates the protein mixture by isoelectric point in the first dimension, and SDS-PAGE, which separates it by molecular weight in the second dimension. The gels are visualized by dye staining (Coomassie Brilliant Blue), metal staining (silver staining), or other total-protein stains, or by stains specific for glycoproteins or phosphorylated proteins. Proteins can also be transferred to a membrane by Western blotting for immunodetection or other analyses. Gels are scanned with a densitometer, phosphor imager, or fluorescence scanner, and image analysis software (Gel Image and PDQuest) is used for spot detection, quantification, background subtraction, and spot matching. In this way, differentially expressed proteins can be identified.
Difference gel electrophoresis (DIGE): Building on traditional two-dimensional gel electrophoresis, DIGE adds multiplexed fluorescence analysis so that several samples carrying different fluorescent labels are separated together on the same gel. The fluorophores used for labeling belong to the same class and have similar molecular structures, the same molecular weight, and a positive charge. After they react with lysine residues in the peptide chain, the same protein from different samples migrates to the same position, which substantially improves the accuracy, reliability, and reproducibility of the results. The sensitivity is comparable to that of silver staining and SYPRO Ruby, and 100–200 pg of protein can be detected.
Non-2DE-Based Protein Expression Spectrum
Two-dimensional liquid chromatography-mass spectrometry (2DLC-MS): multidimensional chromatographic separation combined with mass spectrometry compensates for the weakness of two-dimensional electrophoresis-mass spectrometry in displaying proteins that are of low abundance, hydrophobic, strongly basic, or very large or very small. The protein mixture is first digested, the resulting peptides are separated by chromatography, and MS/MS analysis of the peptides is then used to identify the proteins (the shotgun approach). The most representative configuration to date is strong cation exchange (SCX) followed by reversed phase (RP). Peptides are first fractionated on the SCX column according to electrostatic interaction. Fractions eluted from the SCX column by gradient elution are loaded directly onto the reversed-phase column, from which they are eluted by the mobile phase according to their hydrophobic interaction with the column. Finally, the peptides are detected by mass spectrometry.
Surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) [54, 55] consists of a protein chip, a time-of-flight mass spectrometer, and analysis software. Protein chips have either chemical or biological surfaces. Chemical surfaces include hydrophobic, hydrophilic, weak cation exchange, strong anion exchange, metal ion affinity, and specific binding surfaces. Biological surfaces include antigen-antibody, receptor-ligand, DNA-protein, and enzyme surfaces. Chips with chemical surfaces need little sample and can be used directly on serum, body fluids, and urine. They lend themselves to automated, high-throughput operation and perform well in identifying low-abundance hydrophobic proteins. With the recently introduced combination of ClinProt liquid chips and mass spectrometry, peptide sequences can be detected directly.
Labeling-based protein expression profiling:
(i) Isotope-coded affinity tags (ICAT) rely on a stable isotope labeling reagent and dedicated analytical equipment. The tagging reagent consists of a biotin tag, a linker, and a reactive group. The biotin tag is used to isolate the labeled peptides. The linker comes in two forms: deuterium-labeled in D8-ICAT and hydrogen (unlabeled) in D0-ICAT. The reactive group binds to the SH group of cysteine residues in the peptide. Equal amounts of the proteins to be compared are reacted with the D8 or D0 ICAT reagent; the labeled proteins are mixed in equal amounts and digested to give a peptide mixture. Purification on an avidin column yields the ICAT-labeled peptides. MS analysis of the molecular weight and intensity ratio (D0/D8) of the D0- and D8-ICAT peptides gives the difference in abundance of the same peptide between samples, and MS/MS can then be used to sequence the differentially expressed peptides.
(ii) Amino acid-coded mass tagging (AACT), also called SILAC, follows the same basic principle as ICAT. The major difference is that in AACT a stable isotope-labeled amino acid is supplied in the culture of the cells to be analyzed, so that newly synthesized proteins acquire the mass tag biosynthetically. These cells are then mixed in equal amounts with cells grown on the normal amino acid and digested. Finally, MS is used to analyze the difference in the same peptides between the samples. This technique can also improve the accuracy of peptide sequencing on MALDI TOF/TOF mass spectrometers.
(iii) Isobaric tags for relative and absolute quantitation (iTRAQ): the basic principle is the same as that of ICAT. However, the tag consists of a reporter group (114–117), a balance group (31–28), and an amine-reactive group, so that the total mass of every tag is the same. The different protein samples are labeled with different tags, and the relative proportion of a given protein in each sample is obtained when the pooled sample is analyzed by mass spectrometry. Differentially expressed proteins identified in this way can then be selected for further research.
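The arithmetic behind the isobaric design can be checked with a short sketch. The reporter range 114–117 and balance range 31–28 are taken from the text; pairing them one-to-one in order, the function names, and the reporter intensities in the example are assumptions made for illustration.

```python
# Reporter and balance group masses, paired in order (assumed 4-plex layout).
REPORTER_MASSES = (114, 115, 116, 117)
BALANCE_MASSES = (31, 30, 29, 28)


def tag_masses():
    """Combined reporter + balance mass for each tag; all should be equal."""
    return [r + b for r, b in zip(REPORTER_MASSES, BALANCE_MASSES)]


def relative_proportions(reporter_intensities):
    """Share of one peptide contributed by each sample.

    `reporter_intensities` maps reporter mass -> measured intensity
    (hypothetical values in the example below).
    """
    total = sum(reporter_intensities.values())
    return {mass: value / total for mass, value in reporter_intensities.items()}


if __name__ == "__main__":
    print(tag_masses())  # every tag has the same combined mass
    example = {114: 200.0, 115: 100.0, 116: 100.0, 117: 400.0}
    print(relative_proportions(example))
```

Because the combined mass is constant, the same peptide from every sample appears as a single precursor peak, and only the reporter ions released on fragmentation reveal how much each sample contributed.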
Protein Evaluation and Characteristic Analysis Thereof
Western blot 
Verification of protein expression: to avoid false positives among differentially expressed proteins and to increase confidence in the observed data, proteins are transferred after SDS-PAGE to a solid support, i.e., a membrane (nitrocellulose or PVDF), for immunodetection, staining, and other solid-phase analyses. Modified proteins such as glycoproteins can be detected by chromogenic methods, lectin-based techniques, or fluorescent glycoprotein detection; the detection level depends on the degree of glycosylation of the protein. Phosphorylated proteins can be detected with antibodies against phosphoserine, phosphothreonine, and phosphotyrosine.
Immunoprecipitation is based on the antibody-antigen reaction and is divided into immunoprecipitation, co-immunoprecipitation, and tandem affinity purification (TAP). The basic steps of co-immunoprecipitation and immunoprecipitation are the same; they differ only in the pretreatment. TAP is used to purify protein complexes. The gene for the bait protein is fused to two tag sequences, and the fusion gene is introduced into yeast or a mammalian cell line. The expressed fusion protein is purified in two successive chromatography steps on affinity columns that bind the tags, so that protein complexes can be separated and purified under near-native conditions and the protein-protein interactions can then be studied.
Evaluation on protein functions
Functional evaluation relies mainly on genetic manipulation, including gene complementation and gene deletion. For gene complementation, an expression vector for the gene encoding the protein is constructed and transfected, and changes in gene expression regulation, cellular metabolism, and cell behavior are observed; alternatively, the gene is knocked in to create a transgenic animal or organism model so that the changes in the transgenic organism as a whole can be studied. Gene deletion approaches include antisense nucleic acids, RNA interference (RNAi), ribozymes, and oligonucleotides that bind the gene promoter, which knock down gene expression at the transcriptional and posttranscriptional levels, as well as gene knockout, which produces an organism lacking the gene. The effects of the knockdown or knockout on gene regulation, signaling pathways, and the metabolism of the cell and organism are then studied.
Research on Tumor Protein Markers in Serum and Tissue Cells
Tumor serum markers are of two types: autoantibodies produced when tumor antigens stimulate the immune system of the host, and protein markers that derive from the tumor and are closely related to its development. Given the different nature of these two types of marker, SERPA and two-dimensional gel electrophoresis-mass spectrometry, respectively, can be used to screen serum for tumor-related antibodies and proteins.
Serum proteome analysis (SERPA)
SERPA is a technique that combines proteomics with immunology to achieve high-throughput screening and identification of tumor antigens and their antibodies [61, 62]. Total protein from tumor tissue or cells is separated by two-dimensional gel electrophoresis and transferred to a membrane by Western blotting. The membrane is then incubated with serum from tumor patients, and the immunoreactive spots are visualized. The tumor antigens are identified by mass spectrometry of the corresponding spots on the two-dimensional gel. Because no expression library has to be constructed, large numbers of patient serum samples can be analyzed and the frequency of each tumor antibody can be calculated. More importantly, antigens carrying posttranslational modifications can be found. Since its introduction, the technique has therefore been quickly applied to the screening and identification of tumor antigens in kidney cancer, lung cancer, and breast cancer [63, 64]. Le Naour used it to find eight proteins against which specific antibodies were present in the serum of more than 10 % of liver cancer patients.
Two-dimensional gel electrophoresis-mass spectrometry of serum and tumor-related protein markers: applying two-dimensional gel electrophoresis-mass spectrometry to serum faces several difficulties. First, the abundance of serum proteins spans as much as 12 orders of magnitude (10^12). For example, albumin and immunoglobulins account for 60–97 % of total serum protein, while the proteins that could serve as disease-related markers account for less than 1 % [65, 66]. It is therefore crucial to remove the high-abundance proteins from serum before analysis. Second, serum samples vary greatly between individuals, so samples within the same group can be pooled first to make the differences between groups more reliable. Finally, Western blotting can be used to verify the candidate protein markers and ensure the reliability of the result.
SELDI-TOF-MS: it is widely used in fields such as oncology, new drug development, infectious disease, and psychiatric illness. Examples include the tumor diagnosis study published in the Lancet in 2002 and the early diagnosis program for ovarian cancer jointly launched by the FDA and NCI. Compared with the traditional CA 125 index (positive predictive value of only 35 %), a panel of protein fingerprint peaks reached a sensitivity of 100 % and a positive predictive value of 94 %. The method has been applied to small-cell lung cancer, prostate cancer, kidney cancer, breast cancer, and neck tumors. It has also been introduced in China, where it has been used for bladder cancer, glioma, pancreatic cancer, blood diseases, and chronic liver diseases, including hepatitis, cirrhosis, liver cancer, and its metastasis and recurrence. However, the measurement places strict requirements on the sample, and it is limited by system instability and by the downstream software analysis.
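For readers less familiar with the two figures quoted above, sensitivity and positive predictive value can be computed from a confusion table as sketched below. The counts in the example are hypothetical and are chosen only so that the results have the same order of magnitude as the quoted percentages; they are not the data of any cited study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, and positive predictive value.

    tp/fn: diseased subjects called positive/negative by the test.
    fp/tn: healthy subjects called positive/negative by the test.
    """
    return {
        "sensitivity": tp / (tp + fn),   # share of diseased subjects detected
        "specificity": tn / (tn + fp),   # share of healthy subjects cleared
        "ppv": tp / (tp + fp),           # share of positive calls that are true
    }


if __name__ == "__main__":
    # Hypothetical counts: 47 patients, all detected; 53 controls, 3 false alarms.
    metrics = diagnostic_metrics(tp=47, fp=3, tn=50, fn=0)
    for name, value in metrics.items():
        print(f"{name}: {value:.2%}")
```

Sensitivity depends only on the diseased group, whereas the positive predictive value also depends on how many healthy subjects are tested, which is why a marker can have acceptable sensitivity and still a low positive predictive value in screening.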
Selection of tumor biological marker via the proteomics of cell and tissue
Comparative proteomics compares the dynamic changes and differences in protein expression between tumor tissue and normal tissue, between tumor tissue and adjacent tissue, and across the stages of disease, and it characterizes modification status, tissue distribution, tissue specificity, and detection sensitivity. Retrospective and prospective studies are then needed to test whether a protein can serve as a tumor marker, and the results can be compared with serum proteomics. HSP 27 serves as an example. Two-dimensional gel electrophoresis (2DE) was used to separate proteins from HCC tissue with metastasis and from six HCC tissues without metastasis. Sixteen protein spots with significant differences were detected, including S100 calcium-binding protein (S100), HSP 27, and keratin 18 (CK 18). The expression level of HSP 27 was shown to be closely related to the metastatic potential of liver cancer. Research on serum proteomics further confirmed that HSP 27 could be used as a potential liver cancer-related biological marker.
In medicine, proteomics can aid research on the pathogenesis, early diagnosis, and treatment of human diseases. Comparative analysis reveals differences in overall protein expression between normal and diseased tissue cells and between cells at different stages of disease. Identifying and quantifying the differentially expressed proteins can uncover new disease-related markers, providing new methods and evidence for the study of human disease as well as targets for tumor therapy.
5.2 Polymolecular Classification Model of Tumor Molecular Marker
5.2.1 The Expression Difference of Molecular Marker
Comparative studies reveal differences in the expression of genes, proteins, or metabolites under different pathological conditions, i.e., differentially expressed molecular markers, which should then be examined as follows.
5.2.1.1 Verification on the Expression Difference of Molecular Markers
RT-PCR or real-time PCR is used to study mRNA expression semiquantitatively or quantitatively, and Western blot, immunohistochemistry (including chip technology), or immunofluorescence cytochemistry is used to study expression at the protein level, in order to verify the chip results.
5.2.1.2 Bioinformatic Analysis on Expression Difference of Molecular Markers
Cluster analysis, PPI analysis, database-oriented meta-analysis, or HTML-based aggregate analysis is commonly used to assess potential biological markers for early diagnosis, prediction, and prognosis.
5.2.2 Methods for Ascertaining the Tumor Biological Marker
Chip technologies in genomics and transcriptomics and mass spectrometry in proteomics and metabolomics are the commonly used high-throughput detection methods, and they are central to the screening of disease molecular markers. The large volumes of data they produce are analyzed with bioinformatic techniques. Building a disease classification model involves the following steps: normalization of data, selection of features and a classification algorithm, and testing of the mathematical model.
5.2.2.1 Normalization of Data
Both technical and biological sources of variation are present in an experiment. Raw data should therefore be normalized before measurements are compared, in order to reduce the differences between experiments.
In gene chip experiments, housekeeping genes are commonly used as control spots, and the ratio of sample spot to control spot is used to reduce technical error. Biological error can be reduced by replicate experiments. In practice, gene chip data are usually normalized by LOWESS. SAM software (Significance Analysis of Microarrays) is then used to identify differentially expressed genes, followed by cluster analysis.
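The housekeeping-gene ratio described above can be sketched in a few lines. This is an illustrative simplification: the gene names, the intensities, and the function names are assumptions, and the intensity-dependent smoothing used in practice is not attempted here.

```python
from math import log2
from statistics import mean


def normalize_to_housekeeping(intensities, housekeeping):
    """Divide every spot by the mean intensity of the housekeeping spots."""
    reference = mean(intensities[gene] for gene in housekeeping)
    return {gene: value / reference for gene, value in intensities.items()}


def log2_ratio(sample, control, gene):
    """log2 fold change of one gene between two normalized arrays."""
    return log2(sample[gene] / control[gene])


if __name__ == "__main__":
    controls = ["ACTB", "GAPDH"]  # illustrative housekeeping genes
    # Hypothetical raw intensities; the tumor array is scanned twice as bright.
    normal = {"ACTB": 1000.0, "GAPDH": 3000.0, "GENE_X": 500.0}
    tumor = {"ACTB": 2000.0, "GAPDH": 6000.0, "GENE_X": 4000.0}

    normal_n = normalize_to_housekeeping(normal, controls)
    tumor_n = normalize_to_housekeeping(tumor, controls)
    print(log2_ratio(tumor_n, normal_n, "GENE_X"))
```

The raw intensities of GENE_X differ eightfold between the two arrays, but half of that difference is the overall brightness of the array; after normalization the remaining fourfold change is the part attributable to the sample.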
For mass spectrometry, the downstream processing of data is more complicated. Once the raw peptidome spectra are obtained, the spectra must first be aligned so that the same peak in the same sample lines up across repeated measurements. Besides the software supplied with commercial instruments, some research groups have developed software that accepts a wider range of file formats to overcome compatibility problems between analysis packages. After alignment, denoising and normalization are still required. Denoising removes matrix noise, electronic interference, and random ion motion, and corrects the spectrum baseline [78, 79]. Normalization removes systematic error caused by the sample or instrument, generally using the mean or median of the selected peaks as the reference. The mass-to-charge ratio and intensity of each peak can then be measured reliably. Biomarker Pattern Software (BPS) is most often used to identify the peaks that differ.
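Baseline correction and normalization to the median can be sketched as follows. This is a deliberately crude illustration: the baseline is assumed to be a constant offset equal to the lowest intensity, which real spectra rarely satisfy, and the function names and intensities are hypothetical.

```python
from statistics import median


def subtract_baseline(intensities):
    """Remove a constant baseline, taken here as the lowest intensity."""
    baseline = min(intensities)
    return [value - baseline for value in intensities]


def normalize_by_median(intensities):
    """Scale a spectrum so that its median intensity becomes 1."""
    reference = median(intensities)
    if reference == 0:
        raise ValueError("median intensity is zero; cannot normalize")
    return [value / reference for value in intensities]


def normalize_spectrum(intensities):
    """Baseline subtraction followed by median normalization."""
    return normalize_by_median(subtract_baseline(intensities))


if __name__ == "__main__":
    # Two hypothetical measurements of the same three peaks, taken with
    # different instrument gain and different baseline offsets.
    run_a = [12.0, 14.0, 18.0]
    run_b = [50.0, 70.0, 110.0]
    print(normalize_spectrum(run_a))
    print(normalize_spectrum(run_b))
```

After both steps the two runs give the same relative peak heights, so remaining differences between samples are less likely to reflect instrument settings.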
Selection of Features and Classification Algorithm
With gene chip and mass-spectrometric techniques, researchers can obtain large amounts of gene and peptide expression data. If the expression of one or more genes or peptides differs markedly between groups of specimens, a classification model based on them may have strong discriminating power in the diagnosis or prediction of disease. The chosen markers (also called properties or features) generally have two characteristics: they carry pathological meaning for the discrimination or classification of disease, and they carry mutually dependent information. The number of features should therefore be reduced as far as possible to achieve high efficiency, and the selection of markers plays a vital role in the accuracy of the disease classification model.
Existing feature selection algorithms fall into two types:
Filter: the features are ranked by a score, and the several highest-ranked features are chosen.
Wrapper: the classification algorithm is embedded in the feature selection process, so that the classification result is the criterion by which the best feature subset is chosen.
In studies of multifactor cancer classification models, the wrapper approach is commonly used for feature selection.
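A minimal sketch of the filter approach is given here. It assumes two groups of specimens labeled 1 and 0 and ranks features by the absolute Welch t-statistic; this scoring rule is only one possible choice and is not taken from the studies cited. All names and values are hypothetical, and each group needs at least two specimens.

```python
from math import sqrt
from statistics import mean, variance


def t_score(group_a, group_b):
    """Absolute Welch t-statistic between two groups of measurements."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    if standard_error == 0:
        return 0.0
    return abs(mean(group_a) - mean(group_b)) / standard_error


def filter_select(features, labels, k):
    """Rank features by t-score and return the names of the top k."""
    scores = {}
    for name, values in features.items():
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        scores[name] = t_score(cases, controls)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical expression values for three markers in six specimens.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "marker_1": [5.0, 5.2, 4.9, 1.0, 1.1, 0.9],
    "marker_2": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "marker_3": [3.0, 3.5, 2.5, 2.0, 2.6, 1.5],
}
print(filter_select(features, labels, 2))
```

A wrapper would instead evaluate candidate feature subsets by training and testing the classifier itself, which is more expensive but takes the interaction between features into account.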
A classification algorithm assigns the objects to be identified to categories in the feature space by computational means. In basic operation, training samples are used to fit and optimize the algorithm so that it reaches the highest accuracy on the training set, which yields the classification model; the model is then used to classify new specimens. Currently, multifactor cancer classification models, particularly those based on SELDI-TOF-MS data, mostly adopt the decision tree [81–92]. A decision tree has few nodes, and each node corresponds to a specific peak, which points to further research on single molecular markers. The other major merit of the decision tree is that it can process mixed samples with different kinds of features, including both numerical and non-numerical values. It is therefore practical for the joint analysis of SELDI-TOF-MS data and existing clinical indices. Other common classification algorithms include the artificial neural network [93, 94] and the support vector machine. The former is suited to large numbers of specimens but may suffer from overfitting (“over-learning”), which causes a large gap in performance between the training set and the test set; the latter rests on stricter mathematical theory and reaches a global optimum, and it is better suited to small sample sets, but in its basic form it handles only two-class problems.
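The idea that each node of a decision tree tests one specific peak can be sketched with the simplest possible tree, a single-node "stump". The sketch assumes binary labels (1 for cancer, 0 for control) and searches for the intensity threshold with the highest training accuracy; it is an illustration, not the tree-building procedure of any particular software package, and the data are hypothetical.

```python
def fit_stump(values, labels):
    """Find the threshold on one peak intensity with the best training accuracy.

    Returns (threshold, polarity, accuracy). With polarity 1 the stump predicts
    class 1 above the threshold; with polarity -1 it predicts class 1 below it.
    """
    best_accuracy, best_threshold, best_polarity = 0.0, None, 1
    ordered = sorted(set(values))
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2  # midpoint between neighboring values
        for polarity in (1, -1):
            predictions = [1 if polarity * (v - threshold) > 0 else 0
                           for v in values]
            accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
            if accuracy > best_accuracy:
                best_accuracy, best_threshold, best_polarity = accuracy, threshold, polarity
    return best_threshold, best_polarity, best_accuracy


# Hypothetical intensities of one peak in six specimens.
peak_intensity = [2.0, 2.5, 3.0, 7.0, 8.0, 9.0]
diagnosis = [0, 0, 0, 1, 1, 1]
print(fit_stump(peak_intensity, diagnosis))
```

A full decision tree repeats this search recursively on the subsets produced by each split, which is why every node can be read as a statement about a single peak.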
For gene chip-based classification models, prediction analysis for microarrays (PAM) [97, 98], nearest mean, the nearest centroid classifier, k-nearest neighbor [100, 101], log-linear models, multidimensional ranking, and the compound covariate predictor are used in addition to the decision tree, artificial neural network, and support vector machine [103, 105] described above. It is difficult to judge the relative merits of different classification algorithms on mathematical grounds alone. The appropriate approach is to compare the results of different algorithms on the same sample set and choose the most suitable classification model for the specific problem [95, 97]; alternatively, cross-checking between different algorithms can be used to build the model with the highest accuracy for that problem.
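Comparing algorithms on the same sample set can be sketched as follows. The example assumes leave-one-out cross-validation as the common test procedure and compares a nearest centroid classifier with a 1-nearest-neighbor classifier; the validation scheme, function names, and data are illustrative assumptions, not the protocol of the cited studies.

```python
from math import dist
from statistics import mean


def nearest_centroid(train_x, train_y, x):
    """Assign x to the class whose mean profile (centroid) is closest."""
    centroids = {}
    for label in set(train_y):
        rows = [row for row, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(column) for column in zip(*rows)]
    return min(centroids, key=lambda label: dist(centroids[label], x))


def one_nearest_neighbor(train_x, train_y, x):
    """Assign x to the class of the single closest training sample."""
    closest = min(range(len(train_x)), key=lambda j: dist(train_x[j], x))
    return train_y[closest]


def leave_one_out_accuracy(classifier, xs, ys):
    """Hold out each sample in turn, train on the rest, and count the hits."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += classifier(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)


# Hypothetical two-gene expression profiles for six specimens.
profiles = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],
            [3.0, 3.2], [2.9, 3.1], [3.2, 2.8]]
classes = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]

for classifier in (nearest_centroid, one_nearest_neighbor):
    print(classifier.__name__, leave_one_out_accuracy(classifier, profiles, classes))
```

Because both classifiers see exactly the same held-out samples, their accuracies are directly comparable, which is the point of using one sample set for all algorithms.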
Testing of the Disease Classification Model
Multiple regression analysis [106, 107], ROC curve analysis [101, 103, 108], prospective validation [24, 97, 102, 109], and retrospective validation are usually adopted to test a disease classification model.
Multiple regression analysis studies the correlation between a dependent variable (the diagnostic outcome) and several independent variables (the molecular markers), as well as the correlations among the independent variables. For tumor diagnosis, it reveals the contribution of each molecular marker to the classification model and the correlations between the markers, which can guide future research on tumor development, metastasis, and recurrence as well as prognosis and survival. The most widely used regression methods, logistic regression and Cox regression, both estimate their parameters by maximum likelihood. Logistic regression is suited to a categorical dependent variable and allows the influence of each factor on the dependent variable to be quantified; Cox regression is mainly used in survival analysis, where the dependent variable is the special case of patient survival time. Multiple regression analysis can be carried out with common software such as SPSS and Excel.
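To make the maximum likelihood estimation concrete, the following sketch fits a logistic regression with a single marker by gradient ascent on the log-likelihood. It is a bare illustration under simple assumptions (one independent variable, a fixed step size and iteration count, hypothetical data); statistical packages use more robust solvers and also report standard errors.

```python
from math import exp


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Estimate intercept b0 and slope b1 by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = y - sigmoid(b0 + b1 * x)  # observed minus predicted
            grad0 += error
            grad1 += error * x
        b0 += learning_rate * grad0 / n
        b1 += learning_rate * grad1 / n
    return b0, b1


# Hypothetical marker levels and diagnoses (1 = cancer, 0 = control).
marker = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
outcome = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(marker, outcome)
print("odds ratio per unit of marker:", exp(b1))
```

The exponential of the slope is the odds ratio, which is the usual way of expressing how much one marker contributes to the classification.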
ROC (receiver operating characteristic) analysis is a method based on sensitivity and specificity that reflects the accuracy of a classification model through the area under the curve (AUC). It can also be used quantitatively to evaluate the contribution of a single molecular marker to the diagnostic classification model and to multi-marker joint analysis, so as to improve overall performance. It can be carried out with SPSS and Excel.
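The AUC can be computed without drawing the curve, because it equals the probability that a randomly chosen positive specimen receives a higher score than a randomly chosen negative one (ties count one half). The sketch below uses this pairwise definition; the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons of scores."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores for three patients and three controls.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))
```

An AUC of 0.5 corresponds to a model with no discriminating power and an AUC of 1.0 to perfect separation of the two groups.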
For cancer classification models, double-blind retrospective validation on a sample set is widely used, e.g., comparison of overall survival (OS) and disease-free survival (DFS) in the life table. In tumor metastasis research, particular attention is paid to comparing metastasis rates. Prospective validation on patient survival has also been reported. However, owing to the practical difficulty, validation of a model's cancer predictions by follow-up of a high-risk population has not yet been reported. Prospective research is nevertheless needed on multi-marker models that are useful in prediction, diagnosis, and clinical outcome or prognosis.
5.3 Molecular Markers of Colorectal Cancer and Its Hepatic Metastasis
5.3.1 Common Tumor Molecular Markers of Colorectal Cancer
The incidence of colorectal cancer in China has been rising, and the 5-year survival rate is only 50 %. CEA and carbohydrate antigen 19-9 (CA 19-9) are the two common colon cancer markers; they are mainly used to evaluate the curative effect and to monitor tumor recurrence at a late stage, and so have little value for screening colon cancer at an early stage. Serological indices with high sensitivity and specificity for the diagnosis of colorectal cancer are currently unavailable, and new tumor markers are therefore needed.
Shiwa et al. discovered a protein with a molecular weight of 12 kDa in colon cancer cell lines; mass spectrometry identified it as prothymosin α, which may be a biological marker for the diagnosis of colon cancer. Lawrie et al. carried out a proteomic analysis of the colon cancer cell line LIM 1215, identified 92 membrane proteins, and proposed a “target ion” approach for identifying low-abundance proteins. Simpson et al. also analyzed LIM 1215 and established a membrane protein database for further study of the development of colorectal cancer.
Studying the highly metastatic colon cancer cell line HCT 116, Ahmed et al. found that urokinase plasminogen activator (uPA) and its receptor (uPAR) may be significant factors in the progression or metastasis of colon cancer; the work helped to establish a proteomic database of uPAR signaling molecules and suggests a new approach to the diagnosis and treatment of colon cancer. Stierum et al. analyzed the proteome of the colorectal cancer cell line Caco-2 and detected 11 proteins related to its proliferation and differentiation. The research shows that the proteins FABL, CH 60, GTA 1, TCTP, and NDKA are closely related to colorectal cancer, which helps to clarify the molecular mechanism of its occurrence and development.
Xu et al. used the SELDI protein chip to analyze serum specimens of colorectal cancer patients and set up seven models, each made up of several distinctive protein peaks. The accuracy of preoperative staging ranged from 78.72 % to 86.67 %. Roboz et al. used the hydrophobic chip (H4) and found that the protein at m/z 8942 was highly expressed and the protein at m/z 9300 was expressed at a low level. Petricoin et al. compared colorectal cancer with polyps and found a protein at m/z 13.8 × 10³ that is expressed in both, which is meaningful for the early screening of colorectal cancer.
Friedman analyzed 12 specimens of colorectal cancer tissue and normal tissue and obtained more than 1,500 distinct protein spots. Mass-spectrometric identification revealed 52 proteins with abnormal expression, including cytokeratin, annexin IV, creatine kinase, and fatty-acid-binding protein, which greatly enriched the protein database of colorectal tumors.
Chaurand et al. analyzed the mucosa of normal colon and of colon cancer and found that S100A8, S100A9, and S100A11 of the calcium-binding protein family were increased in cancer tissue, suggesting that these three proteins are specific markers of colon cancer.
Stulik et al. [120, 121] found that the content of EF-2, Mn-SOD, and nm 23 was particularly high in colon cancer. In addition, nine proteins changed in the same way in cancer tissue and adenoma tissue: expression of L-psoriasis-related protein and carbonic anhydrase decreased, while expression of S100A11, the basic variant of PPIase, annexin III and VI, DDA H, CK 18, and inhibin increased, which demonstrates a correlation between the changes in these proteins and the development of colorectal cancer.
Roblick et al. analyzed specimens of normal tissue, adenoma tissue, cancer tissue, and metastatic tissue from sigmoid carcinoma patients by 2DE, peptide mass fingerprinting (PMF), and MS, and made comparisons both within individual patients and between patients. They found 112 protein spots with abnormal expression, of which 72 proteins were identified: 46 were increased and 26 were decreased.
Pei Haiping et al. found that the differentially expressed proteins apolipoprotein A1, calreticulin precursor, glutathione S-transferase (GST), liver-type fatty-acid-binding protein, and heat shock protein 27 could serve as candidate biological markers for the early diagnosis of colorectal cancer.
An Ping et al.  found that loss of calmodulin, DNase 262 precursor protein, and α-mannosidase and the increase of apolipoprotein are related to the occurrence of colorectal cancer and hepatic metastasis.
Tachibana et al. conducted a proteome analysis of primary and metastatic colon tumors and identified Apo A1 (apolipoprotein A1). Further study by RT-PCR and immunohistochemistry showed that expression of Apo A1 in the primary tumor is much lower than in the metastatic tumor, and that Apo A1 expression is related to the degree of malignancy of colonic adenocarcinoma. Apo A1 could therefore be a potential marker of increased tumor invasiveness.
5.3.2 Molecular Markers for Hepatic Metastasis of Colorectal Cancer
The hepatic metastasis of colorectal cancer is a secondary, or metastatic, liver cancer. Research on its molecular markers therefore covers the gene level and genomics, protein expression and proteomics, and immunohistochemistry, as well as integrated studies of genes, proteins, clinical pathological and physiological indices, and biostatistics.
5.3.2.1 Research on Gene Level and Genomics
In multigene chip research, Lin et al. used whole-genome chips, statistical analysis, and significance analysis of microarrays (SAM) to analyze 48 primary colorectal cancers and 28 hepatic metastases, and identified 778 genes differentially expressed between primary tumors and their metastases. The analysis shows that, compared with the primary tumor, genes related to tissue remodeling and immune response were upregulated in metastases, while genes related to proliferation and oxidative phosphorylation were downregulated. Real-time PCR confirmed the increased expression of osteopontin, versican, ADAM 17, CKS 2, PRDX 1, CXCR 4, CXCL 12, and LCN 2; the upregulation of these genes and of the tissue remodeling and immune response genes is associated with invasion and migration to new sites, and these genes could facilitate tumor growth. The downregulation of proliferation-related genes, however, indicates that proliferation is lower in metastases than in the primary tumor.
Analysis of gene expression profiles at different stages of metastatic and nonmetastatic colorectal cancer yielded 115 differentially expressed gene signatures, among which the TGF-β inhibitor BAMBI was increased only in metastatic primary tumors and metastases, in nearly half of them. BAMBI inhibited the TGF-β signaling pathway and increased the migration of cancer cells; β-catenin co-activated BCL 9-2 in the Wnt pathway. BAMBI gene expression could thus be used to predict metastasis. It has also been reported that the expression of FGF-1 and FGF-2 in various cancers is related to poor prognosis. Sato et al. used quantitative real-time reverse transcription PCR to compare 202 colorectal cancer tissues with the adjacent normal mucosa and found that expression of FGFR-2 was decreased. Analysis of the relation between clinicopathological characteristics and gene expression showed that increased FGFR-1 was related to hepatic metastasis.
Non-chip gene expression methods are also used to study metastasis-related genes. For example, MMP-7 from cancer cells takes part in the invasion and metastasis of tumor cells by destroying the basement membrane, and epidemiology shows that increased IGF-1 is related to colorectal cancer. Oshima et al. used RT-PCR to study MMP-7, IGF-1, IGF-2, IGF-1R, and β-actin mRNA in the cancer tissue and adjacent normal mucosa of 205 untreated colorectal cancers: gene expression of MMP-7 and IGF-1R was increased and that of IGF-1 was decreased, and IGF-1R was related to venous invasion and hepatic metastasis, so that they are useful predictive indices for hepatic metastasis of colorectal cancer.
Research shows that the receptor tyrosine kinases EphA 4 and EphB 2 participate in the occurrence and development of various cancers. Oshima et al. used RT-PCR and clinical pathology to analyze specimens of cancer tissue and adjacent normal mucosa from 205 untreated colorectal cancers and found that an increase of EphA 4 and a decrease of EphB 2 were related to hepatic metastasis. No correlation was found, however, between the gene expression of EphA 4 and that of EphB 2. The increase of EphA 4 and the decrease of EphB 2 could be used to predict hepatic metastasis of colorectal cancer.
Akashi et al. used reverse transcription PCR to study CEA mRNA in the draining venous blood collected before resection in 80 colorectal cancer patients: 80 % (28/35) were CEA mRNA positive and free from hepatic metastasis. In the Cox risk model, lymphatic metastasis was the only factor that predicted recurrence as hepatic metastasis. The research did not show that CEA mRNA in the draining venous blood has high predictive value for hepatic metastasis, although the presence of cancer cells in the draining venous blood is a key element and initial step of hepatic metastasis.
Osteopontin (OPN) is reported to be a phosphoprotein related to tumorigenesis. In a transcriptional study of colorectal cancer, Rohde et al. found high expression of OPN transcripts. Real-time reverse transcription PCR, multivariate analysis, and immunohistochemistry were then used to analyze 13 normal colon tissues, nine adenomas, 120 primary colorectal cancers, and ten hepatic metastases, which revealed a remarkable increase of OPN in primary colon cancer and hepatic metastasis.
Rubie et al. used Q-RT-PCR and ELISA to analyze six cases of ulcerative colitis (UC), eight colorectal adenomas (CRA), 48 colorectal cancers (CRC) at different stages, and 16 synchronous or metachronous colorectal cancer hepatic metastases (CRLM). The results showed that IL-8 expression was related to the occurrence and development of colorectal cancer and hepatic metastasis. IL-8 was markedly overexpressed in CRC specimens compared with CRA and UC, with a 30-fold increase relative to CRA tissue; IL-8 was closely related to tumor grade; and expression of IL-8 in CRLM, as compared with primary colorectal cancer tissue, was 80 times higher than the normal level.
Miyagawa et al. used the terminal deoxynucleotidyl transferase-mediated end-labeling (TUNEL) method to analyze paraffin-embedded tissue from 70 resected colorectal cancer hepatic metastases, and found that the number of apoptotic cancer cells and the tumor expression of gp 96 affected the number of CD 83-positive cells at the invasive margin of the cancer. The CD 83-positive cell was the key factor in predicting hepatic metastasis of colorectal cancer.
5.3.2.2 Research on Protein Expression and Proteomics
High-throughput, real-time proteomic research focuses on the patterns of dynamic variation under different pathological and physiological conditions, so as to find differentially expressed molecules and select disease-related molecular markers. The molecular markers of colorectal cancer hepatic metastasis can be screened by proteomic techniques.
Shi et al. used 35S-methionine labeling and 2DE-MS to compare the newly synthesized proteomes of colorectal cancer (CRC) hepatic metastasis and normal colon mucosa cultured in vitro for 16 h, and found that the newly synthesized protein consisted mainly of low-abundance cytoplasmic proteins and classical secreted proteins. Thirty-two differentially expressed proteins were found, among which desmocollin-2 was increased and fibrinogen gamma chain was decreased. Further research may thus discover serum markers of colorectal cancer hepatic metastasis.
Katayama et al. used 2D-DIGE with maleimide CyDye fluorescent labels and LC/MS/MS to study protein changes in CRC metastasis, using the cell lines SW 480 (from the primary tumor) and SW 620 (from a lymphatic metastasis) of the same patient. For in vivo studies of metastasis, the two cell lines were injected into the spleen of nude mice. Nine proteins were markedly increased in SW 620 as compared with SW 480, and the in vivo metastasis test showed that α-enolase and triosephosphate isomerase were related to the metastasis of these two cell lines.
Pei et al. used two-dimensional gel electrophoresis, ionization time-of-flight mass spectrometry, and immunoblotting to study fresh tumor and matched normal mucosa of CRC without lymph node metastasis (non-LNM CRC) and with lymph node metastasis (LNM CRC). Tissue chip technology and immunohistochemical staining were also applied to 40 paraffin-embedded non-LNM and LNM CRC specimens, which identified four differentially expressed proteins. There were 25 differentially expressed proteins between normal mucosa and CRC tissue. Compared with non-LNM CRC, heat shock protein-27 (HSP-27), glutathione S-transferase (GST), and annexin II were increased in LNM CRC, while liver fatty-acid-binding protein (L-FABP) was decreased, which suggests a risk of LNM in CRC.