Big Data Information Reconstruction on an Infinite Tree for a $4\times 4$-state Asymmetric Model with Community Effects

Liu, Wenjian; Ning, Ning

doi:10.1007/s10955-019-02372-7

Abstract: The information reconstruction problem on an infinite tree is to collect and analyze massive data samples at the $n$th level of the tree in order to identify whether there is non-vanishing information of the root as $n$ goes to infinity. This problem has wide applications in various fields such as biology, information theory, and statistical physics, and its close connections to cluster learning, data mining, and deep learning have been well established in recent years. Although it has been studied in numerous contexts, the existing literature with rigorously established reconstruction thresholds is very limited. In this paper, motivated by a classical deoxyribonucleic acid (DNA) evolution model, the F$81$ model, and taking into consideration Chargaff's parity rule by allowing for a guanine-cytosine content bias, we study the noisy channel given by a $4\times 4$-state asymmetric probability transition matrix with community effects for the four nucleobases of DNA. The corresponding information reconstruction problem in molecular phylogenetics is explored by means of refined analyses of moment recursion, in-depth concentration estimates, and thorough investigations of an asymptotic $4$-dimensional nonlinear second-order dynamical system. We rigorously show that the reconstruction bound is not tight when the sum of the base frequencies of adenine and thymine falls in the interval $\left(0,1/2-\sqrt{3}/6\right)\cup \left(1/2+\sqrt{3}/6,1\right)$, which is the first rigorous result on asymmetric noisy channels with community effects.
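For a sense of scale, the interval endpoints in the abstract evaluate to roughly 0.2113 and 0.7887. The following is a minimal sketch, not taken from the paper, that evaluates the endpoints and checks whether a given adenine-plus-thymine frequency sum lies in the stated range; the names `LOWER`, `UPPER`, and `in_non_tight_regime` are hypothetical and chosen only for illustration.

```python
import math

# Endpoints of the interval stated in the abstract:
# (0, 1/2 - sqrt(3)/6) union (1/2 + sqrt(3)/6, 1).
LOWER = 0.5 - math.sqrt(3) / 6  # approximately 0.2113
UPPER = 0.5 + math.sqrt(3) / 6  # approximately 0.7887


def in_non_tight_regime(at_frequency_sum: float) -> bool:
    """Hypothetical helper: True if the A+T base-frequency sum lies in
    the open set (0, LOWER) or (UPPER, 1) given in the abstract."""
    return 0.0 < at_frequency_sum < LOWER or UPPER < at_frequency_sum < 1.0


if __name__ == "__main__":
    for s in (0.1, 0.5, 0.9):
        print(s, in_non_tight_regime(s))
```

The check only encodes membership in the stated interval; the abstract makes no claim about frequency sums outside it, so a `False` result here should not be read as a statement that the bound is tight.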

Comments: arXiv admin note: text overlap with arXiv:1812.06039
Subjects: Probability (math.PR)
Cite as: arXiv:1812.10475 [math.PR] (or arXiv:1812.10475v1 [math.PR] for this version)
https://doi.org/10.48550/arXiv.1812.10475
Related DOI: https://doi.org/10.1007/s10955-019-02372-7
