Using Molecular Simulations and Data Science to Investigate How Zeolites Form and Which Hypothetical Frameworks Seem Feasible
Scott Auerbach
University of Massachusetts Amherst
Even though millions of siliceous zeolite structures have been generated by computer-aided searches, none of these hypothetical frameworks has yet been synthesized. This inability frustrates efforts to fabricate promising zeolites for new technologies in areas such as carbon mitigation, and raises a fascinating question: what, if anything, makes real and hypothetical zeolites systematically different? This question has intrigued zeolite scientists for decades, yet most work to date on the zeolite problem has been limited to assessing intuitive structural descriptors. Here, we tackle this problem through a rigorous data science scheme called the “zeolite sorting hat.” The zeolite sorting hat blends principal component analysis, a linear support vector machine (SVM), and a data-driven thermodynamic stability filter. This approach yields three hypothetical frameworks as promising targets for synthesis, each with a composition suggested by the zeolite sorting hat. The zeolite sorting hat points to second-neighbor Si-O distances as the key factor that distinguishes real from hypothetical structures. We emphasize that this finding is an outcome of our study, not an assumed working hypothesis.
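To make the classification step concrete, the following is a minimal sketch of a PCA plus linear SVM pipeline, assuming scikit-learn and NumPy are available. It is not the actual zeolite sorting hat: the descriptor matrix is synthetic random data, the descriptor count, component count, and SVM settings are arbitrary assumptions, and the thermodynamic stability filter is omitted. It only illustrates how hypothetical structures that fall on the "real" side of a linear decision boundary could be flagged, and how the SVM weights can be mapped back onto the original descriptors to see which ones drive the separation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): rows are frameworks, columns are
# structural descriptors. Real descriptors would come from the frameworks.
n_real, n_hypo, n_features = 60, 240, 12
X_real = rng.normal(size=(n_real, n_features))
X_hypo = rng.normal(size=(n_hypo, n_features)) + 2.0  # shifted so classes differ

X = np.vstack([X_real, X_hypo])
y = np.concatenate([np.ones(n_real, dtype=int), np.zeros(n_hypo, dtype=int)])

# Standardize, reduce dimensionality with PCA, then fit a linear SVM.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    LinearSVC(C=1.0, class_weight="balanced", max_iter=10000),
)
model.fit(X, y)

# Signed distance to the decision boundary; positive means "looks real".
scores = model.decision_function(X)
hypo_scores = scores[y == 0]
candidates = np.flatnonzero(hypo_scores > 0)  # hypotheticals classified as real

# Map the SVM weight vector from PCA space back to the scaled descriptors
# to rank which descriptors contribute most to the separation.
pca = model.named_steps["pca"]
svm = model.named_steps["linearsvc"]
descriptor_weights = (svm.coef_ @ pca.components_).ravel()
ranking = np.argsort(-np.abs(descriptor_weights))
```

In a real analysis, the candidates would then pass through a separate stability filter, and the ranked descriptor weights are where a finding such as the importance of one particular distance would show up.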
The sorting hat workflow was extended to analyze a large dataset of Monte Carlo trajectories that simulate zeolite LTA crystal formation. The Monte Carlo simulations predict (and experiments confirm) that LTA forms faster with two organic structure-directing agents (OSDAs) that match the sizes of the small and large LTA pores ("2OSDA" synthesis) than with only one large OSDA type ("1OSDA" synthesis). A linear SVM was trained to separate structural microstates from the 1OSDA and 2OSDA datasets; the results suggest that the 2OSDA dataset is much more structurally homogeneous. Inspired by this finding, we computed the pair entropy for each microstate in the 1OSDA and 2OSDA datasets, finding that the difference in Si-Si pair entropy semi-quantitatively accounts for the speedup in zeolite formation seen in the Monte Carlo simulations. Experimental implications of this entropic speedup for high-pressure syntheses are discussed.
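As an illustration of what a pair entropy calculation can look like, the sketch below uses the standard two-body excess entropy expression, S2/kB = -2πρ ∫ [g(r) ln g(r) - g(r) + 1] r² dr, evaluated from a radial distribution function g(r). Whether the study uses exactly this expression is an assumption, and the function names, the cubic periodic box, and the test configurations (a random gas and a simple cubic lattice rather than Si positions from a simulation) are hypothetical choices made for the example.

```python
import numpy as np

def radial_distribution(positions, box_length, n_bins=50):
    """Return bin centers r, g(r), and number density for a cubic periodic box."""
    n = len(positions)
    r_max = box_length / 2.0
    # Pairwise separations with the minimum-image convention.
    delta = positions[:, None, :] - positions[None, :, :]
    delta -= box_length * np.round(delta / box_length)
    dist = np.linalg.norm(delta, axis=-1)[np.triu_indices(n, k=1)]
    counts, edges = np.histogram(dist, bins=n_bins, range=(0.0, r_max))
    r = 0.5 * (edges[1:] + edges[:-1])
    shell_volumes = 4.0 / 3.0 * np.pi * (edges[1:] ** 3 - edges[:-1] ** 3)
    volume = box_length ** 3
    # Expected pair counts per shell for an ideal gas at the same density.
    ideal_counts = 0.5 * n * (n - 1) * shell_volumes / volume
    return r, counts / ideal_counts, n / volume

def pair_entropy(r, g, density):
    """Two-body excess entropy per particle in units of kB."""
    safe_g = np.where(g > 0, g, 1.0)  # g ln g tends to 0 as g tends to 0
    integrand = g * np.log(safe_g) - g + 1.0
    dr = r[1] - r[0]
    return -2.0 * np.pi * density * np.sum(integrand * r ** 2) * dr

rng = np.random.default_rng(1)
box = 5.0

# Disordered reference: 125 uniformly random points.
gas = rng.uniform(0.0, box, size=(125, 3))
# Ordered reference: 5 x 5 x 5 simple cubic lattice at the same density.
grid = np.arange(5, dtype=float)
lattice = np.array(np.meshgrid(grid, grid, grid)).reshape(3, -1).T

s2_gas = pair_entropy(*radial_distribution(gas, box))
s2_lattice = pair_entropy(*radial_distribution(lattice, box))
```

The integrand is non-negative, so the pair entropy is at most zero and becomes more negative as the structure becomes more ordered; the lattice therefore yields a much lower value than the random gas. Comparing such values between two sets of microstates is the kind of difference the paragraph above refers to.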