SuperML Java Advanced Machine Learning Framework

A comprehensive 21-module machine learning framework for Java

21 Modules

Modular architecture with specialized components

AutoML

Automated machine learning with one-line training

Visualization

Professional charts with dual-mode display

A comprehensive, modular machine learning library for Java, inspired by scikit-learn and designed for enterprise-grade applications. Version 2.1.0 features a sophisticated 22-module architecture with production-validated performance delivering 400,000+ predictions per second.

🚀 Features

Core Machine Learning (15+ Algorithms)

  • Linear Models (6): Logistic Regression, Linear Regression, Ridge, Lasso, SGD Classifier/Regressor
  • Tree-Based Models (5): Decision Trees, Random Forest (Classifier/Regressor), Gradient Boosting, XGBoost
  • Neural Networks (3): Multi-Layer Perceptron (MLP), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN)
  • Clustering (1): K-Means with k-means++ initialization and advanced convergence
  • Preprocessing: StandardScaler, MinMaxScaler, RobustScaler, LabelEncoder, Neural Network-specific preprocessing
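
To make the preprocessing entry concrete, here is a minimal sketch of what standard scaling does: each column is shifted to zero mean and divided by its standard deviation. The class `StandardScalerSketch` and its methods are hypothetical names written for this illustration; they are not the SuperML `StandardScaler` API, whose signatures are not shown in this document. The sketch assumes the population standard deviation (dividing by n), as scikit-learn does, and the library may differ.

```java
import java.util.Arrays;

/** Hypothetical illustration of z-score scaling; not the SuperML StandardScaler class. */
public class StandardScalerSketch {
    private double[] mean;
    private double[] std;

    /** Learns the per-column mean and population standard deviation from X. */
    public StandardScalerSketch fit(double[][] X) {
        int n = X.length;
        int d = X[0].length;
        mean = new double[d];
        std = new double[d];
        for (double[] row : X) {
            for (int j = 0; j < d; j++) mean[j] += row[j] / n;
        }
        for (double[] row : X) {
            for (int j = 0; j < d; j++) {
                double diff = row[j] - mean[j];
                std[j] += diff * diff / n;
            }
        }
        for (int j = 0; j < d; j++) {
            std[j] = Math.sqrt(std[j]);
            if (std[j] == 0.0) std[j] = 1.0; // constant column: avoid division by zero
        }
        return this;
    }

    /** Applies (x - mean) / std column by column, using the statistics learned in fit. */
    public double[][] transform(double[][] X) {
        double[][] out = new double[X.length][];
        for (int i = 0; i < X.length; i++) {
            out[i] = new double[X[i].length];
            for (int j = 0; j < X[i].length; j++) {
                out[i][j] = (X[i][j] - mean[j]) / std[j];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] train = {{1.0, 10.0}, {2.0, 10.0}, {3.0, 10.0}};
        StandardScalerSketch scaler = new StandardScalerSketch().fit(train);
        System.out.println(Arrays.deepToString(scaler.transform(train)));
    }
}
```

Fitting on the training split only and reusing the learned statistics on the test split keeps test data from leaking into preprocessing.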

Advanced Features

  • AutoML Framework: Automated algorithm selection and hyperparameter optimization
  • Dual-Mode Visualization: Professional XChart GUI with ASCII terminal fallback
  • Model Selection: Cross-validation, Grid/Random Search, advanced hyperparameter tuning
  • Pipeline System: Seamless chaining of preprocessing and modeling steps
  • High-Performance Inference: Microsecond predictions with caching and batch processing
  • Model Persistence: Save/load models with automatic statistics and metadata capture
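
The idea behind the pipeline entry can be sketched in a few lines: a pipeline is an ordered list of steps, and the output of each step becomes the input of the next. `PipelineSketch` and `addStep` below are hypothetical names chosen for this illustration, not the SuperML pipeline API, and each step is simplified to a function over a single feature vector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of step chaining; not the SuperML Pipeline API. */
public class PipelineSketch {
    private final List<UnaryOperator<double[]>> steps = new ArrayList<>();

    /** Appends a step and returns this, so that calls can be chained. */
    public PipelineSketch addStep(UnaryOperator<double[]> step) {
        steps.add(step);
        return this;
    }

    /** Runs the input through every step in insertion order. */
    public double[] apply(double[] input) {
        double[] current = input;
        for (UnaryOperator<double[]> step : steps) {
            current = step.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        PipelineSketch pipeline = new PipelineSketch()
            .addStep(x -> Arrays.stream(x).map(v -> v - 1.0).toArray())  // shift
            .addStep(x -> Arrays.stream(x).map(v -> v * 2.0).toArray()); // scale
        System.out.println(Arrays.toString(pipeline.apply(new double[]{1.0, 2.0, 3.0})));
        // prints [0.0, 2.0, 4.0]
    }
}
```

A real pipeline would also fit each preprocessing step on training data and end in an estimator; this sketch shows only the chaining.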

Production & Enterprise

  • Cross-Platform Export: ONNX and PMML support for enterprise deployment
  • Drift Detection: Real-time model and data drift monitoring with statistical tests
  • Kaggle Integration: One-line training on any Kaggle dataset with automated workflows
  • Professional Logging: Structured logging with Logback and SLF4J
  • Comprehensive Metrics: Complete evaluation suite for all ML tasks
  • Thread Safety: Concurrent prediction capabilities after model training
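
The drift detection entry mentions statistical tests without naming them. As one plausible example, the sketch below computes the two-sample Kolmogorov-Smirnov statistic between a reference sample of a feature and a current sample, then compares it with the usual large-sample critical value. The choice of test, the class `DriftSketch`, and both method names are assumptions made for illustration; the document does not say which tests SuperML uses.

```java
import java.util.Arrays;

/** Hypothetical drift check based on the two-sample Kolmogorov-Smirnov statistic. */
public class DriftSketch {

    /** Largest absolute gap between the empirical CDFs of the two samples. */
    public static double ksStatistic(double[] reference, double[] current) {
        double[] a = reference.clone();
        double[] b = current.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        int i = 0;
        int j = 0;
        double maxGap = 0.0;
        while (i < a.length && j < b.length) {
            double x = Math.min(a[i], b[j]);
            // Advance both cursors past every value <= x, so ties are handled together.
            while (i < a.length && a[i] <= x) i++;
            while (j < b.length && b[j] <= x) j++;
            double gap = Math.abs((double) i / a.length - (double) j / b.length);
            maxGap = Math.max(maxGap, gap);
        }
        return maxGap;
    }

    /** Flags drift when the statistic exceeds the asymptotic critical value at level alpha. */
    public static boolean driftDetected(double[] reference, double[] current, double alpha) {
        int n = reference.length;
        int m = current.length;
        double c = Math.sqrt(-0.5 * Math.log(alpha / 2.0)); // about 1.358 for alpha = 0.05
        double critical = c * Math.sqrt((double) (n + m) / ((double) n * m));
        return ksStatistic(reference, current) > critical;
    }

    public static void main(String[] args) {
        double[] reference = {0.1, 0.2, 0.3, 0.4, 0.5};
        double[] current = {0.6, 0.7, 0.8, 0.9, 1.0};
        System.out.println("KS statistic: " + ksStatistic(reference, current));
    }
}
```

The critical value is a large-sample approximation, so with only a handful of points even fully separated samples may not be flagged.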

⚡ Performance Highlights

SuperML Java 2.1.0 achieves exceptional performance across all 22 production modules:

🏗️ Build & Deployment Excellence

  • ✅ 22/22 modules compile successfully with zero failures
  • ⚡ ~4 minute complete framework build (clean → install → test)
  • 🧪 145+ comprehensive tests pass with full coverage validation
  • 📦 Production JARs ready for enterprise deployment

🚀 Runtime Performance Benchmarks

  • ⚡ 400,000+ predictions/second - XGBoost batch inference optimization
  • 🔥 35,714 predictions/second - Production pipeline throughput
  • ⚙️ 6.88 microseconds - Single prediction latency (sub-millisecond)
  • 🧠 Real-time neural training - Full epoch-by-epoch loss tracking
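
The throughput and latency figures above are easier to compare on a common scale. Assuming strictly sequential predictions, a latency of 6.88 microseconds corresponds to roughly 145,000 predictions per second, while 35,714 predictions per second works out to about 28 microseconds each and 400,000 per second to 2.5 microseconds each. The helper below, with the hypothetical class name `ThroughputSketch`, only performs this conversion; it does not measure anything and makes no claim about how the published numbers were obtained.

```java
/** Hypothetical helper converting between per-prediction latency and throughput. */
public class ThroughputSketch {

    /** Predictions per second implied by a per-prediction latency, if run back to back. */
    public static double perSecondFromMicros(double latencyMicros) {
        return 1_000_000.0 / latencyMicros;
    }

    /** Average microseconds per prediction implied by a throughput figure. */
    public static double microsFromPerSecond(double predictionsPerSecond) {
        return 1_000_000.0 / predictionsPerSecond;
    }

    public static void main(String[] args) {
        System.out.printf("6.88 us latency -> %.0f predictions/s%n", perSecondFromMicros(6.88));
        System.out.printf("35714 predictions/s -> %.1f us each%n", microsFromPerSecond(35_714));
        System.out.printf("400000 predictions/s -> %.1f us each%n", microsFromPerSecond(400_000));
    }
}
```

Batch figures amortize per-call overhead across many rows, which is one reason a batch throughput can exceed the reciprocal of a single-prediction latency.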

🎯 Algorithm Performance Validated

Algorithm              Training Time    Accuracy    Test Results
──────────────────────────────────────────────────────────────────
XGBoost                2.5 seconds      89%+        ✅ 20 tests passed
Neural Networks        Variable         95%+        ✅ 46 tests passed
Random Forest          164 ms           89%+        ✅ Feature importance
Linear Models          <50 ms           72-95%      ✅ 34 tests passed
Cross-Validation       ~100 ms          Robust      ✅ 26 tests passed

🌟 Advanced Capabilities Verified

  • 🎲 AutoML: Automated hyperparameter optimization with grid/random search
  • 📊 Kaggle Integration: Complete workflows from data loading to submission
  • 💾 Model Persistence: High-speed serialization with automatic metadata
  • 📈 Production Monitoring: Real-time drift detection and alerts
  • Cross-Validation: Parallel 5-fold execution with statistical robustness

All performance metrics validated on comprehensive test suite with real-world datasets.
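
For readers unfamiliar with the 5-fold cross-validation mentioned in the list above, the sketch below shows the index bookkeeping: the sample indices are shuffled once, dealt into k disjoint folds, and each fold serves as the test set once while the remaining folds form the training set. `KFoldSketch.folds` is a hypothetical helper written for this illustration; it is not the SuperML cross-validation API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of k-fold index splitting; not the SuperML API. */
public class KFoldSketch {

    /** Returns k disjoint test folds that together cover indices 0..n-1, after a seeded shuffle. */
    public static List<List<Integer>> folds(int n, int k, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) indices.add(i);
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) folds.add(new ArrayList<>());
        // Deal the shuffled indices round-robin so fold sizes differ by at most one.
        for (int i = 0; i < n; i++) folds.get(i % k).add(indices.get(i));
        return folds;
    }

    public static void main(String[] args) {
        int n = 150; // for example, the size of the Iris dataset
        List<List<Integer>> folds = folds(n, 5, 42L);
        for (int f = 0; f < folds.size(); f++) {
            int testSize = folds.get(f).size();
            System.out.println("fold " + f + ": test=" + testSize + ", train=" + (n - testSize));
        }
    }
}
```

Because the folds share no state, the k train-and-evaluate runs are independent, which is what makes parallel execution possible.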

🎯 Quick Example

import java.util.Arrays;

import org.superml.datasets.Datasets;
import org.superml.tree.RandomForest;
import org.superml.multiclass.OneVsRestClassifier;
import org.superml.linear_model.LogisticRegression;
import org.superml.metrics.Metrics;
// DataLoaders (used below) also needs an import; its package is not shown in this snippet.

// Load dataset and split it into training and test sets
var dataset = Datasets.loadIris();
var split = DataLoaders.trainTestSplit(dataset.X,
    Arrays.stream(dataset.y).asDoubleStream().toArray(), 0.2, 42);

// Train multiclass model (one-vs-rest wrapper around logistic regression)
var base = new LogisticRegression();
var classifier = new OneVsRestClassifier(base);
classifier.fit(split.XTrain, split.yTrain);

// Or use tree-based model
var forest = new RandomForest(100, 10);
forest.fit(split.XTrain, split.yTrain);

// Make predictions with the forest
double[] predictions = forest.predict(split.XTest);
double[][] probabilities = forest.predictProba(split.XTest);

// Evaluate the forest
double accuracy = Metrics.accuracy(split.yTest, predictions);
System.out.println("Accuracy: " + accuracy);

// Alternatively, train a plain logistic regression with a higher iteration cap
var logistic = new LogisticRegression().setMaxIter(1000);
logistic.fit(split.XTrain, split.yTrain);

// Evaluate the logistic regression (distinct variable names avoid redeclaration errors)
double[] logisticPredictions = logistic.predict(split.XTest);
double logisticAccuracy = Metrics.accuracy(split.yTest, logisticPredictions);
System.out.printf("Accuracy: %.2f%%%n", logisticAccuracy * 100);

Start your machine learning journey with SuperML Java today! 🚀