
BlindML

scikit-learn for Encrypted Data

BlindML trains Naive Bayes, decision tree, histogram, and logistic regression models on encrypted records in seconds, with zero plaintext exposure. Your machine learning workflows stay intact, and your models get the data they need without violating compliance or security policies.

Free and open source on GitHub

Trained on aggregate counts, not plaintext records

Marginals are cross-tabulated count summaries over encrypted fields. The Blind Insight platform computes them by issuing aggregate queries against ciphertext and returning a count table. The model trains on that table—never on feature vectors derived from decrypted records.
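To make the idea of training on a count table concrete, here is a minimal sketch of a Naive Bayes posterior computed from marginal counts alone. It is not BlindML's implementation: the `(field, value, label) -> count` layout, the `posterior_positive` helper, the Laplace smoothing, and all of the numbers are assumptions chosen for illustration.

```python
import math

# Hypothetical marginal table: (field, value, label) -> count.
# This layout is an assumption for illustration, not BlindML's format.
marginals = {
    ("fraud_type", "card_fraud", 1): 40,
    ("fraud_type", "card_fraud", 0): 10,
    ("fraud_type", "none", 1): 10,
    ("fraud_type", "none", 0): 140,
}
n_pos, n_neg = 50, 150  # class totals (made-up numbers)


def posterior_positive(record, marginals, n_pos, n_neg, alpha=1.0):
    """Return P(label=1 | record) from counts only, with Laplace smoothing."""
    totals = {1: n_pos, 0: n_neg}
    log_score = {}
    for label, total in totals.items():
        # Log prior from the class totals.
        score = math.log(total / (n_pos + n_neg))
        for field, value in record.items():
            # Distinct values observed for this field, used for smoothing.
            n_values = len({v for (f, v, _) in marginals if f == field})
            count = marginals.get((field, value, label), 0)
            score += math.log((count + alpha) / (total + alpha * n_values))
        log_score[label] = score
    # Normalize the two log scores into a probability.
    top = max(log_score.values())
    weights = {k: math.exp(v - top) for k, v in log_score.items()}
    return weights[1] / (weights[0] + weights[1])


risk = posterior_positive({"fraud_type": "card_fraud"}, marginals, n_pos, n_neg)
```

No individual record appears anywhere in this computation. Every term is a ratio of counts, which is why a count table can be sufficient to fit this kind of model.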

blind_ml — training
from blind_ml import NaiveBayesModel

# marginals: the count table returned by the Blind Insight platform
model = NaiveBayesModel().fit(marginals, n_pos=3201, n_neg=76402)
pred, risk = model.predict({"fraud_type": "card_fraud"})
# 0.0 F1 delta vs. plaintext, 600K records
blind grants create — scope training access
blind grants create --data '{
  "name": "blindml-training",
  "field_names": {"risk_level": true, "fraud_type": true, "is_fraud": true},
  "can_create_records": false
}'

Supported algorithms

Source of truth: blind-insight/blind-ml on GitHub — open source.

  • Naive Bayes: best for multi-class classification over categorical fields, under the conditional independence assumption.
  • Decision Trees: interpretable split-based models, useful when explainability or auditable decision paths are required.
  • Logistic Regression: linear decision boundaries and calibrated probabilities for binary classification over structured aggregate features.

The numbers.

  • 0.0 F1 delta vs. plaintext baseline, 600K records
  • HIPAA k=11: suppression built in, and F1 holds
  • Naive Bayes · Decision Trees · Logistic Regression: scikit-learn compatible
  • No plaintext exposure: training and inference on aggregates only
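As a rough illustration of what k=11 suppression can mean for a count table, the sketch below masks any nonzero cell smaller than the threshold. The `suppress_small_cells` helper, the use of `None` as the suppression marker, and the "strictly below k" rule are assumptions; the page does not specify how BlindML applies suppression.

```python
def suppress_small_cells(counts, k=11):
    """Mask cells with a nonzero count below k; None marks a suppressed cell."""
    suppressed = {}
    for key, count in counts.items():
        if 0 < count < k:
            suppressed[key] = None  # too few records to report safely
        else:
            suppressed[key] = count
    return suppressed


# Hypothetical cell counts, for illustration only.
cells = {"a": 3, "b": 11, "c": 250, "d": 0}
safe_cells = suppress_small_cells(cells)
```

A model trained on `safe_cells` would need a policy for suppressed entries, such as treating them as missing or smoothing them, and that choice affects accuracy on rare categories.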

Start building on your schema.

Get started at the Build tier, $9/month.