The algorithm was configured with 100 decision trees and a random seed of 42 to ensure reproducibility. Feature importance was quantified by the mean decrease in Gini impurity (Gemayel et al., 2024) ...
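To make the importance measure concrete, here is a minimal sketch of the Gini impurity decrease for a single split, written in plain Python. It is an illustration, not the authors' pipeline: the helper names `gini` and `gini_decrease` and the toy labels are hypothetical. In a random forest, this per-split decrease is accumulated for each feature over every node that splits on it and then averaged across the trees. If scikit-learn were the library in use (an assumption; the excerpt does not name one), the described setup would correspond to `RandomForestClassifier(n_estimators=100, random_state=42)` with importances read from `feature_importances_`.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease for one split, weighting each child by its size."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Toy node with two balanced classes (hypothetical data).
parent = ["a", "a", "a", "b", "b", "b"]

# A split that separates the classes completely removes all impurity.
perfect = gini_decrease(parent, ["a", "a", "a"], ["b", "b", "b"])

# A split whose children keep the parent's class mix removes none.
useless = gini_decrease(parent, ["a", "b"], ["a", "a", "b", "b"])

print(f"perfect split decrease: {perfect:.3f}")
print(f"useless split decrease: {useless:.3f}")
```

A feature that repeatedly produces splits like the first one accumulates a large mean decrease and ranks as important; a feature that only produces splits like the second contributes almost nothing.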
We benchmark on the community-standard Dalke NN dataset (1,000 high-similarity ChEMBL pairs) — the same dataset widely used by RDKit, CDK, and the academic MCS literature. Identical SMILES input, same ...