PMML Export Guide
SuperML Java 3.0.1 includes a PMML (Predictive Model Markup Language) export module for deploying trained models across ML platforms and programming languages. Exporting a trained SuperML model to the industry-standard PMML format lets any PMML-capable runtime load and score it.
Overview
PMML is an XML-based standard that allows machine learning models to be shared between different applications and platforms. SuperML's PMML module provides:
- Full PMML 4.4 compliance with proper schema validation
- 6 model types supported: Linear/Logistic Regression, Ridge, Lasso, Decision Trees, Random Forest
- Cross-platform deployment to Spark, Python, R, and enterprise systems
- Production-ready validation and error handling
- Custom feature mapping for business-friendly model descriptions
Supported Models and Features
Fully Supported Model Types
| Model Type | PMML Element | Key Features |
|---|---|---|
| LinearRegression | <RegressionModel> | Coefficients, intercept, feature names |
| LogisticRegression | <RegressionModel> | Logit normalization, class probabilities |
| Ridge | <RegressionModel> | L2 regularization metadata, shrunk coefficients |
| Lasso | <RegressionModel> | L1 regularization, automatic zero-coefficient exclusion |
| DecisionTree | <TreeModel> | Hierarchical splits, leaf predictions, feature thresholds |
| RandomForest | <MiningModel> | Ensemble representation, majority voting, tree segments |
PMML 4.4 Features
- Complete Headers: Application metadata, timestamps, model provenance
- Data Dictionary: Feature definitions with data types and operational types
- Mining Schema: Input/output field specifications and usage types
- Model-Specific Elements: Algorithm-appropriate PMML structures
- Validation Support: Schema compliance and correctness checking
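As a rough illustration of how these parts sit inside one document, the sketch below parses a small hand-written PMML string with the JDK's DOM parser and reads back the DataDictionary field names and the MiningSchema target. The class name `PmmlStructureSketch` and the sample XML are assumptions made for this example; they are not part of the SuperML API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of the SuperML API.
class PmmlStructureSketch {

    // Hand-written sample containing a header, data dictionary, and mining schema.
    static final String SAMPLE =
        "<PMML version=\"4.4\">"
        + "<Header><Application name=\"Example\" version=\"1.0\"/></Header>"
        + "<DataDictionary numberOfFields=\"2\">"
        + "<DataField name=\"age\" optype=\"continuous\" dataType=\"double\"/>"
        + "<DataField name=\"salary\" optype=\"continuous\" dataType=\"double\"/>"
        + "</DataDictionary>"
        + "<RegressionModel functionName=\"regression\">"
        + "<MiningSchema>"
        + "<MiningField name=\"age\" usageType=\"active\"/>"
        + "<MiningField name=\"salary\" usageType=\"target\"/>"
        + "</MiningSchema>"
        + "<RegressionTable intercept=\"1.0\">"
        + "<NumericPredictor name=\"age\" coefficient=\"2.0\"/>"
        + "</RegressionTable>"
        + "</RegressionModel>"
        + "</PMML>";

    // Parses an XML string with the JDK's built-in DOM parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    // Names of all DataField elements, in document order.
    static List<String> dataFieldNames(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList fields = doc.getElementsByTagName("DataField");
        for (int i = 0; i < fields.getLength(); i++) {
            names.add(((Element) fields.item(i)).getAttribute("name"));
        }
        return names;
    }

    // Name of the first MiningField marked usageType="target", or null if none.
    static String targetField(Document doc) {
        NodeList fields = doc.getElementsByTagName("MiningField");
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            if ("target".equals(field.getAttribute("usageType"))) {
                return field.getAttribute("name");
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println("Data fields: " + dataFieldNames(doc));
        System.out.println("Target: " + targetField(doc));
    }
}
```

The same two lookups can be pointed at a file exported by the converter to confirm that the field names and the target came out as intended.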
Quick Start
Basic Model Export
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.superml.linear_model.LinearRegression;
import org.superml.pmml.PMMLConverter;

public class BasicPMMLExample {
    public static void main(String[] args) throws IOException {
        // 1. Train your SuperML model (X_train and y_train hold your training data)
        LinearRegression model = new LinearRegression();
        model.fit(X_train, y_train);
        // 2. Create PMML converter
        PMMLConverter converter = new PMMLConverter();
        // 3. Export to PMML
        String pmmlXml = converter.convertToXML(model);
        // 4. Validate the PMML
        boolean isValid = converter.validatePMML(pmmlXml);
        System.out.println("PMML is valid: " + isValid);
        // 5. Save to file for deployment, as UTF-8 to match the XML declaration
        Files.write(Paths.get("model.pmml"), pmmlXml.getBytes(StandardCharsets.UTF_8));
        System.out.println("Model exported to model.pmml");
    }
}
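One detail of the save step is worth a closer look. The sample PMML in this guide declares encoding="UTF-8", so the bytes on disk should be UTF-8 as well, and `String.getBytes()` with no argument uses the JVM default charset, which is not guaranteed to be UTF-8 before Java 18. The sketch below writes and re-reads a PMML string with an explicit charset. `PmmlFileSketch` and the sample XML are hypothetical names used only for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only; not part of the SuperML API.
class PmmlFileSketch {

    // Writes the XML as UTF-8 so the bytes match the encoding in the XML declaration.
    static void save(Path target, String pmmlXml) {
        try {
            Files.write(target, pmmlXml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the file back, decoding it as UTF-8.
    static String load(Path source) {
        try {
            return new String(Files.readAllBytes(source), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Saves to a temporary file, reads it back, and removes the file again.
    static String roundTrip(String pmmlXml) {
        try {
            Path tmp = Files.createTempFile("model", ".pmml");
            try {
                save(tmp, pmmlXml);
                return load(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The field name contains non-ASCII characters, written as Unicode escapes.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<PMML version=\"4.4\"><DataDictionary>"
            + "<DataField name=\"gr\u00f6\u00dfe\" optype=\"continuous\" dataType=\"double\"/>"
            + "</DataDictionary></PMML>";
        System.out.println("Round trip preserved content: " + xml.equals(roundTrip(xml)));
    }
}
```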
Advanced Export with Custom Features
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.superml.ensemble.RandomForestClassifier;
import org.superml.pmml.PMMLConverter;

public class AdvancedPMMLExample {
    public static void main(String[] args) throws IOException {
        // Train a Random Forest model (X_train and y_train hold your training data)
        RandomForestClassifier model = new RandomForestClassifier()
            .setNumTrees(100)
            .setMaxDepth(10)
            .setMinSamplesLeaf(5);
        model.fit(X_train, y_train);
        // Define business-friendly feature names, one per model input
        String[] businessFeatures = {
            "customer_age", "annual_income", "credit_score",
            "debt_to_income_ratio", "employment_years"
        };
        String targetName = "loan_approval_probability";
        // Export with custom names
        PMMLConverter converter = new PMMLConverter();
        String pmmlXml = converter.convertToXML(model, businessFeatures, targetName);
        // Validate and deploy
        boolean isValid = converter.validatePMML(pmmlXml);
        if (isValid) {
            Files.write(Paths.get("business_model.pmml"), pmmlXml.getBytes(StandardCharsets.UTF_8));
            System.out.println("✅ Business model exported successfully!");
        } else {
            System.err.println("❌ PMML validation failed!");
        }
    }
}
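Custom names only help if they line up with the model's inputs. This guide does not say how the converter reacts to a mismatched array, so a pre-export check along the following lines may be useful. `FeatureNameCheck` and its rules (matching count, no blanks, no duplicates) are assumptions for illustration, not SuperML behavior.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical pre-export check; the rules here are assumptions, not SuperML behavior.
class FeatureNameCheck {

    // Returns null when the names look usable, otherwise a message for the first problem found.
    static String firstProblem(String[] names, int expectedCount) {
        if (names == null) {
            return null; // null requests the converter's default names
        }
        if (names.length != expectedCount) {
            return "expected " + expectedCount + " names but got " + names.length;
        }
        Set<String> seen = new HashSet<>();
        for (String name : names) {
            if (name == null || name.trim().isEmpty()) {
                return "blank feature name";
            }
            if (!seen.add(name)) {
                return "duplicate feature name: " + name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] names = {"customer_age", "annual_income", "customer_age"};
        System.out.println(firstProblem(names, 3));
    }
}
```

Running the check before calling the three-argument `convertToXML` keeps a naming mistake from surfacing later on the scoring platform.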
API Reference
PMMLConverter Class
The main class for PMML conversion:
public class PMMLConverter {
    // Basic conversion with default feature names
    public String convertToXML(Object model)
    // Advanced conversion with custom feature names
    public String convertToXML(Object model, String[] featureNames, String targetName)
    // PMML validation
    public boolean validatePMML(String pmmlXml)
    // Future: import from PMML (planned v3.1.0); currently throws UnsupportedOperationException
    public Object convertFromXML(String pmmlXml)
}
Method Details
convertToXML(Object model)
- Purpose: Converts a SuperML model to PMML with default feature names
- Parameters:
  - model - Any trained SuperML model extending BaseEstimator
- Returns: Valid PMML 4.4 XML string
- Throws: IllegalArgumentException for unsupported or untrained models
convertToXML(Object model, String[] featureNames, String targetName)
- Purpose: Converts a model with business-friendly feature names
- Parameters:
  - model - Trained SuperML model
  - featureNames - Array of custom feature names (null for default)
  - targetName - Custom target variable name
- Returns: PMML XML with custom field names
- Use Cases: Business reporting, cross-team collaboration, regulatory compliance
validatePMML(String pmmlXml)
- Purpose: Validates PMML against schema and correctness requirements
- Parameters:
  - pmmlXml - PMML XML string to validate
- Returns: true if fully compliant, false otherwise
- Validation Checks: Schema compliance, required elements, data consistency
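To make the kinds of checks listed for validatePMML more concrete, here is a minimal structural check written against the JDK XML parser alone. It is a sketch of the idea, not how `validatePMML` is implemented: `PmmlSanityCheck`, `looksLikePmml`, and the choice of required elements are assumptions, and a full validation would also check the document against the PMML 4.4 XSD.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical structural check; not the SuperML validatePMML implementation.
class PmmlSanityCheck {

    // True when the input is well-formed XML with a versioned PMML root,
    // one Header, and one DataDictionary.
    static boolean looksLikePmml(String xml) {
        if (xml == null || xml.trim().isEmpty()) {
            return false;
        }
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations so untrusted input cannot pull in external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal parse errors instead of printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
            return "PMML".equals(root.getTagName())
                && !root.getAttribute("version").isEmpty()
                && root.getElementsByTagName("Header").getLength() == 1
                && root.getElementsByTagName("DataDictionary").getLength() == 1;
        } catch (Exception e) {
            // Malformed XML or a parser configuration problem: treat as invalid.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(looksLikePmml("<PMML version=\"4.4\"><Header/><DataDictionary/></PMML>"));
        System.out.println(looksLikePmml("<invalid>xml"));
    }
}
```

Returning false instead of throwing mirrors the boolean contract described for `validatePMML`, which keeps call sites simple at the cost of hiding the reason for a failure.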
Model-Specific PMML Structures
Linear Regression Models
Generated PMML Structure:
<?xml version="1.0" encoding="UTF-8"?>
<PMML version="4.4">
  <Header>
    <Application name="SuperML Java Framework" version="3.1.2"/>
    <Timestamp>2025-07-20T22:30:00</Timestamp>
  </Header>
  <DataDictionary numberOfFields="4">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="education" optype="continuous" dataType="double"/>
    <DataField name="salary" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="age" usageType="active"/>
      <MiningField name="income" usageType="active"/>
      <MiningField name="education" usageType="active"/>
      <MiningField name="salary" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="25000.50">
      <NumericPredictor name="age" coefficient="150.25"/>
      <NumericPredictor name="income" coefficient="0.45"/>
      <NumericPredictor name="education" coefficient="2500.75"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
Key Features:
- Coefficient extraction via reflection (getCoefficients())
- Intercept inclusion (getIntercept())
- Automatic feature name mapping
- Continuous target type for regression
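A PMML consumer scores a RegressionTable like the one above as the intercept plus the sum of each coefficient times its input value. The sketch below applies that formula to the sample coefficients. `RegressionTableSketch` and the input values (age 30, income 50000, education 16) are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical scorer for illustration; a real deployment would use a PMML runtime.
class RegressionTableSketch {

    // prediction = intercept + sum(coefficient_i * value_i)
    static double score(double intercept, Map<String, Double> coefficients, Map<String, Double> inputs) {
        double result = intercept;
        for (Map.Entry<String, Double> entry : coefficients.entrySet()) {
            Double value = inputs.get(entry.getKey());
            if (value == null) {
                throw new IllegalArgumentException("missing input: " + entry.getKey());
            }
            result += entry.getValue() * value;
        }
        return result;
    }

    // Coefficients copied from the sample RegressionTable.
    static Map<String, Double> sampleCoefficients() {
        Map<String, Double> coefficients = new LinkedHashMap<>();
        coefficients.put("age", 150.25);
        coefficients.put("income", 0.45);
        coefficients.put("education", 2500.75);
        return coefficients;
    }

    public static void main(String[] args) {
        Map<String, Double> inputs = new LinkedHashMap<>();
        inputs.put("age", 30.0);
        inputs.put("income", 50000.0);
        inputs.put("education", 16.0);
        // 25000.50 + 150.25*30 + 0.45*50000 + 2500.75*16
        System.out.println(score(25000.50, sampleCoefficients(), inputs));
    }
}
```

Comparing a hand computation like this with the output of the target platform is a quick way to confirm that an exported model scores the same after deployment.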
Logistic Regression Models
Generated PMML Structure:
<RegressionModel functionName="classification" normalizationMethod="logit">
  <MiningSchema>
    <MiningField name="feature_0" usageType="active"/>
    <MiningField name="feature_1" usageType="active"/>
    <MiningField name="prediction" usageType="target"/>
  </MiningSchema>
  <RegressionTable intercept="0.5" targetCategory="1">
    <NumericPredictor name="feature_0" coefficient="0.123"/>
    <NumericPredictor name="feature_1" coefficient="-0.045"/>
  </RegressionTable>
  <RegressionTable intercept="0.0" targetCategory="0"/>
</RegressionModel>
Key Features:
- Binary and multiclass classification support
- Logit normalization for probability interpretation
- Class labels from the getClasses() method
- Target categories for all classes
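For the two-class sample above, logit normalization can be sketched as follows: compute the linear score from the populated RegressionTable, pass it through the logistic function to get the probability of category "1", and give category "0" the remainder. This assumes the usual two-class reading of normalizationMethod="logit". `LogitSketch` and the input values are illustrative only.

```java
// Hypothetical illustration of logit normalization for a two-class RegressionModel.
class LogitSketch {

    // Linear score: intercept + sum(coefficient_i * x_i).
    static double linear(double intercept, double[] coefficients, double[] x) {
        if (coefficients.length != x.length) {
            throw new IllegalArgumentException("coefficient and input lengths differ");
        }
        double z = intercept;
        for (int i = 0; i < x.length; i++) {
            z += coefficients[i] * x[i];
        }
        return z;
    }

    // Logistic function mapping a score to the range (0, 1).
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Returns {P(category "0"), P(category "1")} using the sample coefficients.
    static double[] probabilities(double[] x) {
        double z = linear(0.5, new double[] {0.123, -0.045}, x);
        double p1 = sigmoid(z);
        return new double[] {1.0 - p1, p1};
    }

    public static void main(String[] args) {
        double[] p = probabilities(new double[] {2.0, 10.0});
        System.out.println("P(0) = " + p[0] + ", P(1) = " + p[1]);
    }
}
```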
Decision Tree Models
Generated PMML Structure:
<TreeModel functionName="classification">
  <MiningSchema>
    <MiningField name="feature_0" usageType="active"/>
    <MiningField name="feature_1" usageType="active"/>
    <MiningField name="class" usageType="target"/>
  </MiningSchema>
  <Node>
    <SimplePredicate field="feature_0" operator="lessOrEqual" value="35.0"/>
    <Node>
      <SimplePredicate field="feature_1" operator="lessOrEqual" value="50000.0"/>
      <Node score="0" recordCount="150"/>
    </Node>
    <Node>
      <SimplePredicate field="feature_1" operator="greaterThan" value="50000.0"/>
      <Node score="1" recordCount="75"/>
    </Node>
  </Node>
</TreeModel>
Key Features:
- Hierarchical tree structure representation
- Split conditions with thresholds
- Leaf node predictions and sample counts
- Both classification and regression support
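The split conditions above pair a lessOrEqual branch with a greaterThan branch on the same threshold, so scoring amounts to walking down the tree until a leaf is reached. The sketch below models that walk with a small binary tree. `TreeSketch` is a hypothetical class, and because the sample does not show what happens when feature_0 exceeds 35.0, the leaf used for that branch is an invented placeholder.

```java
// Hypothetical binary threshold tree for illustration; not a PMML runtime.
class TreeSketch {

    static final class Node {
        final int feature;      // index of the feature tested at a split (unused for leaves)
        final double threshold; // split threshold (unused for leaves)
        final Node left;        // branch taken when value <= threshold
        final Node right;       // branch taken when value > threshold
        final String score;     // non-null only for leaves

        private Node(int feature, double threshold, Node left, Node right, String score) {
            this.feature = feature;
            this.threshold = threshold;
            this.left = left;
            this.right = right;
            this.score = score;
        }

        static Node leaf(String score) {
            return new Node(-1, Double.NaN, null, null, score);
        }

        static Node split(int feature, double threshold, Node left, Node right) {
            return new Node(feature, threshold, left, right, null);
        }
    }

    // Follows lessOrEqual / greaterThan branches until a leaf is reached.
    static String predict(Node root, double[] x) {
        Node node = root;
        while (node.score == null) {
            node = x[node.feature] <= node.threshold ? node.left : node.right;
        }
        return node.score;
    }

    // Mirrors the sample: feature_0 <= 35.0, then feature_1 <= 50000.0 gives "0", else "1".
    static Node sample() {
        return Node.split(0, 35.0,
            Node.split(1, 50000.0, Node.leaf("0"), Node.leaf("1")),
            Node.leaf("1")); // placeholder: this branch is not shown in the sample
    }

    public static void main(String[] args) {
        System.out.println(predict(sample(), new double[] {30.0, 40000.0}));
        System.out.println(predict(sample(), new double[] {30.0, 60000.0}));
    }
}
```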
Random Forest Models
Generated PMML Structure:
<MiningModel functionName="classification" multipleModelMethod="majorityVote">
  <MiningSchema>
    <MiningField name="feature_0" usageType="active"/>
    <MiningField name="feature_1" usageType="active"/>
    <MiningField name="class" usageType="target"/>
  </MiningSchema>
  <Segmentation multipleModelMethod="majorityVote">
    <Segment id="tree_0">
      <True/>
      <TreeModel functionName="classification">
        <!-- Individual tree structure -->
      </TreeModel>
    </Segment>
    <Segment id="tree_1">
      <True/>
      <TreeModel functionName="classification">
        <!-- Individual tree structure -->
      </TreeModel>
    </Segment>
    <!-- Additional trees -->
  </Segmentation>
</MiningModel>
Key Features:
- Ensemble representation using MiningModel
- Individual trees as segments
- Majority vote for classification
- Average aggregation for regression
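The two aggregation rules named above can be sketched in a few lines: each tree contributes one prediction, classification takes the most frequent label, and regression takes the arithmetic mean. `EnsembleSketch` is a hypothetical class, and its tie-breaking rule (the label seen first wins) is an assumption, since this guide does not say how ties are resolved.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical aggregation helpers for illustration; not the SuperML implementation.
class EnsembleSketch {

    // Most frequent label; on a tie, the label that appeared first wins (an assumption).
    static String majorityVote(List<String> votes) {
        if (votes.isEmpty()) {
            throw new IllegalArgumentException("no votes");
        }
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String vote : votes) {
            counts.merge(vote, 1, Integer::sum);
        }
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }

    // Arithmetic mean of the per-tree predictions.
    static double average(double[] predictions) {
        if (predictions.length == 0) {
            throw new IllegalArgumentException("no predictions");
        }
        double sum = 0.0;
        for (double prediction : predictions) {
            sum += prediction;
        }
        return sum / predictions.length;
    }

    public static void main(String[] args) {
        System.out.println(majorityVote(List.of("1", "0", "1")));
        System.out.println(average(new double[] {1.0, 2.0, 3.0}));
    }
}
```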
Cross-Platform Deployment
1. Apache Spark MLlib
Deploy SuperML models in Spark environments:
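import java.io.File
import org.jpmml.evaluator.LoadingModelEvaluatorBuilder
import org.jpmml.evaluator.spark.TransformerBuilder
// Load SuperML PMML with JPMML-Evaluator
val evaluator = new LoadingModelEvaluatorBuilder()
  .load(new File("model.pmml"))
  .build()
// Wrap the evaluator as a Spark ML Transformer (JPMML-Evaluator-Spark)
val pmmlTransformer = new TransformerBuilder(evaluator)
  .withTargetCols()
  .withOutputCols()
  .exploded(true)
  .build()
// Score a DataFrame whose columns match the PMML input fields
val predictions = pmmlTransformer.transform(testDF)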
Supported Platforms:
- Spark 2.4+, 3.x
- Databricks
- Amazon EMR
- Google Dataproc
2. Python scikit-learn Integration
Use SuperML models in Python environments:
from jpmml_evaluator import make_evaluator
import pandas as pd
# Load SuperML PMML in Python
evaluator = make_evaluator("model.pmml")
# Make predictions
test_data = pd.DataFrame({
    'age': [25, 35, 45],
    'income': [50000, 75000, 100000],
    'education': [12, 16, 18]
})
# evaluateAll scores every row of a DataFrame; evaluate handles a single record
predictions = evaluator.evaluateAll(test_data)
print("Predictions:", predictions)
Python Libraries:
- jpmml-evaluator - High-performance PMML execution
- sklearn2pmml - Integration with scikit-learn pipelines
- pandas - Data manipulation and preprocessing
3. R Statistical Environment
Deploy in R for statistical analysis:
library(pmml)
library(XML)
# Load SuperML PMML in R
model <- pmml::loadPMML("model.pmml")
# Make predictions
test_data <- data.frame(
age = c(25, 35, 45),
income = c(50000, 75000, 100000),
education = c(12, 16, 18)
)
predictions <- predict(model, test_data)
print(predictions)
R Packages:
- pmml - PMML import/export functionality
- XML - XML parsing and manipulation
- data.table - High-performance data processing
4. Enterprise ML Platforms
Deploy to enterprise machine learning platforms:
SAS Enterprise Miner
/* Import SuperML PMML */
proc model data=score_data;
import pmml="model.pmml";
score data=score_data out=predictions;
run;
IBM SPSS
* Import PMML model.
MODEL IMPORT FILE="model.pmml".
* Score new data.
COMPUTE prediction = PMML.PREDICT(age, income, education).
EXECUTE.
Microsoft Azure ML
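# Register the PMML file as a model asset in Azure ML
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Model
from azure.identity import DefaultAzureCredential
# Replace the placeholders with your own workspace details
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace-name>",
)
ml_client.models.create_or_update(
    Model(name="superml-model", path="model.pmml", type="custom_model")
)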
Supported Enterprise Platforms:
- SAS Enterprise Miner
- IBM SPSS Modeler
- Amazon SageMaker
- Microsoft Azure ML
- Google Cloud AI Platform
Advanced Usage Patterns
Model Versioning and Metadata
import java.time.LocalDateTime;
import java.util.Map;
import org.superml.pmml.PMMLConverter;
import org.superml.pmml.PMMLMetadata;

// Add custom metadata to PMML export
PMMLMetadata metadata = new PMMLMetadata.Builder()
    .version("3.1.2")
    .author("Data Science Team")
    .description("Customer churn prediction model")
    .trainingDate(LocalDateTime.now())
    .dataSource("customer_database_v2.1")
    .performanceMetrics(Map.of(
        "accuracy", 0.934,
        "precision", 0.891,
        "recall", 0.876
    ))
    .build();
PMMLConverter converter = new PMMLConverter();
String pmmlWithMetadata = converter.convertToXML(model, featureNames, targetName, metadata);
A/B Testing Deployment
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ABTestingDeployment {
    // X_train, y_train, and features (the shared feature names) are assumed to be defined elsewhere
    public void deployModelsForTesting() throws IOException {
        // Train baseline model
        LinearRegression baselineModel = new LinearRegression();
        baselineModel.fit(X_train, y_train);
        // Train experimental model
        Ridge experimentalModel = new Ridge().setAlpha(0.1);
        experimentalModel.fit(X_train, y_train);
        PMMLConverter converter = new PMMLConverter();
        // Export both models with the same feature and target names
        String baselinePMML = converter.convertToXML(baselineModel,
            features, "conversion_rate");
        String experimentalPMML = converter.convertToXML(experimentalModel,
            features, "conversion_rate");
        // Save with version identifiers
        Files.write(Paths.get("baseline_v1.0.pmml"), baselinePMML.getBytes(StandardCharsets.UTF_8));
        Files.write(Paths.get("experimental_v1.1.pmml"), experimentalPMML.getBytes(StandardCharsets.UTF_8));
        System.out.println("✅ A/B testing models deployed successfully!");
    }
}
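Once both files exist, the serving side needs a rule for choosing between them. One common approach, sketched here under assumptions, is to hash a stable user identifier into 100 buckets and send a fixed percentage of buckets to the experimental model, so that a given user always sees the same variant. `AbRouterSketch`, its method name, and the percentage parameter are hypothetical; only the two file names come from the example above.

```java
// Hypothetical traffic splitter for illustration; not part of the SuperML API.
class AbRouterSketch {

    static final String BASELINE = "baseline_v1.0.pmml";
    static final String EXPERIMENTAL = "experimental_v1.1.pmml";

    // Picks a model file for the user; the same userId always maps to the same bucket.
    static String modelFileFor(String userId, int experimentalPercent) {
        if (experimentalPercent < 0 || experimentalPercent > 100) {
            throw new IllegalArgumentException("experimentalPercent must be between 0 and 100");
        }
        // floorMod keeps the bucket in 0..99 even when hashCode() is negative.
        int bucket = Math.floorMod(userId.hashCode(), 100);
        return bucket < experimentalPercent ? EXPERIMENTAL : BASELINE;
    }

    public static void main(String[] args) {
        for (String user : new String[] {"alice", "bob", "carol"}) {
            System.out.println(user + " -> " + modelFileFor(user, 20));
        }
    }
}
```

Because `String.hashCode()` is specified by the Java language, the assignment stays stable across restarts and machines.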
Batch Model Export
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import org.superml.pmml.PMMLConverter;

public class BatchModelExport {
    public void exportMultipleModels(Map<String, Object> models) {
        PMMLConverter converter = new PMMLConverter();
        for (Map.Entry<String, Object> entry : models.entrySet()) {
            String modelName = entry.getKey();
            Object model = entry.getValue();
            try {
                // Export each model
                String pmmlXml = converter.convertToXML(model);
                // Validate before saving
                if (converter.validatePMML(pmmlXml)) {
                    String filename = modelName + "_model.pmml";
                    Files.write(Paths.get(filename), pmmlXml.getBytes(StandardCharsets.UTF_8));
                    System.out.println("✅ " + modelName + " exported successfully");
                } else {
                    System.err.println("❌ " + modelName + " validation failed");
                }
            } catch (Exception e) {
                // One failed model should not stop the rest of the batch
                System.err.println("❌ Error exporting " + modelName + ": " + e.getMessage());
            }
        }
    }
}
Testing and Validation
PMML Validation Examples
import org.superml.linear_model.LinearRegression;
import org.superml.pmml.PMMLConverter;

public class PMMLValidationExample {
    public static void main(String[] args) {
        PMMLConverter converter = new PMMLConverter();
        // Test various validation scenarios
        testValidation(converter);
        testErrorHandling(converter);
    }

    private static void testValidation(PMMLConverter converter) {
        System.out.println("=== PMML Validation Tests ===");
        // Valid model PMML
        LinearRegression validModel = new LinearRegression();
        // ... train model
        String validPMML = converter.convertToXML(validModel);
        System.out.println("✅ Valid model PMML: " +
            (converter.validatePMML(validPMML) ? "PASSED" : "FAILED"));
        // Invalid inputs
        System.out.println("✅ Null PMML validation: " +
            (!converter.validatePMML(null) ? "CORRECTLY REJECTED" : "FAILED"));
        System.out.println("✅ Empty PMML validation: " +
            (!converter.validatePMML("") ? "CORRECTLY REJECTED" : "FAILED"));
        System.out.println("✅ Malformed XML validation: " +
            (!converter.validatePMML("<invalid>xml") ? "CORRECTLY REJECTED" : "FAILED"));
    }

    private static void testErrorHandling(PMMLConverter converter) {
        System.out.println("\n=== Error Handling Tests ===");
        try {
            converter.convertToXML(null);
            System.out.println("❌ Null model: Should have thrown exception");
        } catch (IllegalArgumentException e) {
            System.out.println("✅ Null model: Correctly rejected - " + e.getMessage());
        }
        try {
            converter.convertToXML("Not a model");
            System.out.println("❌ Invalid model: Should have thrown exception");
        } catch (Exception e) {
            System.out.println("✅ Invalid model: Correctly rejected - " + e.getMessage());
        }
    }
}
Expected Output
=== PMML Validation Tests ===
✅ Valid model PMML: PASSED
✅ Null PMML validation: CORRECTLY REJECTED
✅ Empty PMML validation: CORRECTLY REJECTED
✅ Malformed XML validation: CORRECTLY REJECTED

=== Error Handling Tests ===
✅ Null model: Correctly rejected - Model cannot be null
✅ Invalid model: Correctly rejected - Unsupported model type
Performance and Monitoring
PMML Export Performance
| Model Type | Export Time | PMML Size | Validation Time |
|---|---|---|---|
| LinearRegression | <1ms | ~2KB | <1ms |
| LogisticRegression | <2ms | ~3KB | <1ms |
| DecisionTree | 5-15ms | 10-50KB | 2-5ms |
| RandomForest (100 trees) | 100-300ms | 500KB-2MB | 10-30ms |
Memory Usage
- Converter Instance: ~2MB baseline
- PMML Generation: 2-5x model size in memory
- Large Random Forests: Streaming export is planned for v3.1.0; until then, budget memory for the 2-5x overhead noted above
Production Monitoring
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import org.superml.pmml.PMMLConverter;
import org.superml.pmml.monitoring.PMMLExportMonitor;

public class ProductionPMMLExport {
    private final PMMLExportMonitor monitor = new PMMLExportMonitor();

    public void monitoredExport(Object model, String modelName) {
        long startTime = System.currentTimeMillis();
        try {
            PMMLConverter converter = new PMMLConverter();
            String pmmlXml = converter.convertToXML(model);
            // Validate
            boolean isValid = converter.validatePMML(pmmlXml);
            // Log metrics: elapsed time, PMML length in characters, validation result
            long exportTime = System.currentTimeMillis() - startTime;
            monitor.logExportMetrics(modelName, exportTime, pmmlXml.length(), isValid);
            if (isValid) {
                Files.write(Paths.get(modelName + ".pmml"), pmmlXml.getBytes(StandardCharsets.UTF_8));
                monitor.logSuccess(modelName);
            } else {
                monitor.logValidationFailure(modelName);
            }
        } catch (Exception e) {
            monitor.logError(modelName, e);
            throw new RuntimeException("PMML export failed for " + modelName, e);
        }
    }
}
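The performance table lists several export times below one millisecond, which a millisecond clock cannot resolve. A minimal sketch of timing with `System.nanoTime()` follows. `ExportTimerSketch` is a hypothetical helper, and the timed task is a stand-in for a real `convertToXML` call.

```java
import java.util.Locale;

// Hypothetical timing helper for illustration; not part of the SuperML API.
class ExportTimerSketch {

    // Runs the task once and returns the elapsed time in nanoseconds.
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    // Formats nanoseconds as milliseconds with three decimals, e.g. "1.500 ms".
    static String formatMillis(long nanos) {
        return String.format(Locale.ROOT, "%.3f ms", nanos / 1_000_000.0);
    }

    public static void main(String[] args) {
        // Stand-in workload; in practice this would wrap converter.convertToXML(model).
        long elapsed = timeNanos(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append("<Node/>");
            }
        });
        System.out.println("Export took " + formatMillis(elapsed));
    }
}
```

A single measurement on the JVM is noisy because of JIT warm-up, so repeated runs or a benchmarking harness give more trustworthy numbers than one call.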
Future Enhancements
Planned Features (v3.1.0)
- Bidirectional Conversion: Import PMML models back to SuperML
- Pipeline Export: Complete preprocessing + model PMML export
- Streaming Export: Memory-efficient large model export
- Custom Transformations: User-defined PMML extensions
Advanced Features (v3.2.0)
- Neural Network Support: PMML export for MLPs and CNNs
- Transformer Support: Limited PMML export for attention models
- Ensemble Methods: Advanced ensemble PMML structures
- Model Monitoring: Built-in drift detection in PMML
Integration Enhancements
- Cloud Deployment: Direct deployment to cloud ML platforms
- Container Support: Docker images with PMML runtime
- REST API: Web service wrapper for PMML models
- Dashboard: Model registry and deployment tracking
Resources and Examples
Documentation Links
- PMML 4.4 Specification
- SuperML PMML API Reference
- Cross-Platform Integration Guide
- Performance Benchmarks
The SuperML Java PMML export module provides production-ready, cross-platform model deployment. It supports six model types, validates the generated PMML, and produces files that PMML-capable platforms can load, so SuperML models can run wherever a PMML runtime is available.
Ready to deploy your models? Start with our Quick Start Guide and explore the examples for your specific use case!