Skip to main content

PMML Export Guide

Complete guide to PMML model export and cross-platform deployment in SuperML Java

PMML Export Guide

PMML Compliance Model Support Cross Platform

SuperML Java 3.0.1 includes comprehensive PMML (Predictive Model Markup Language) export functionality that enables seamless model deployment across different ML platforms and programming languages. Export your trained SuperML models to industry-standard PMML format for maximum interoperability.

๐ŸŽฏ Overview

PMML is an XML-based standard that allows machine learning models to be shared between different applications and platforms. SuperMLโ€™s PMML module provides:

  • Full PMML 4.4 compliance with proper schema validation
  • 6 model types supported: Linear/Logistic Regression, Ridge, Lasso, Decision Trees, Random Forest
  • Cross-platform deployment to Spark, Python, R, and enterprise systems
  • Production-ready validation and error handling
  • Custom feature mapping for business-friendly model descriptions

๐Ÿ—๏ธ Supported Models and Features

โœ… Fully Supported Model Types

Model Type PMML Element Key Features
LinearRegression <RegressionModel> Coefficients, intercept, feature names
LogisticRegression <RegressionModel> Logit normalization, class probabilities
Ridge <RegressionModel> L2 regularization metadata, shrunk coefficients
Lasso <RegressionModel> L1 regularization, automatic zero-coefficient exclusion
DecisionTree <TreeModel> Hierarchical splits, leaf predictions, feature thresholds
RandomForest <MiningModel> Ensemble representation, majority voting, tree segments

โœ… PMML 4.4 Features

  • Complete Headers: Application metadata, timestamps, model provenance
  • Data Dictionary: Feature definitions with data types and operational types
  • Mining Schema: Input/output field specifications and usage types
  • Model-Specific Elements: Algorithm-appropriate PMML structures
  • Validation Support: Schema compliance and correctness checking

๐Ÿš€ Quick Start

Basic Model Export

import org.superml.pmml.PMMLConverter;
import org.superml.linear_model.LinearRegression;

public class BasicPMMLExample {
    public static void main(String[] args) {
        // 1. Train your SuperML model
        LinearRegression model = new LinearRegression();
        model.fit(X_train, y_train);
        
        // 2. Create PMML converter
        PMMLConverter converter = new PMMLConverter();
        
        // 3. Export to PMML
        String pmmlXml = converter.convertToXML(model);
        
        // 4. Validate the PMML
        boolean isValid = converter.validatePMML(pmmlXml);
        System.out.println("PMML is valid: " + isValid);
        
        // 5. Save to file for deployment
        Files.write(Paths.get("model.pmml"), pmmlXml.getBytes());
        
        System.out.println("Model exported to model.pmml");
    }
}

Advanced Export with Custom Features

import org.superml.pmml.PMMLConverter;
import org.superml.ensemble.RandomForestClassifier;

public class AdvancedPMMLExample {
    public static void main(String[] args) {
        // Train a Random Forest model
        RandomForestClassifier model = new RandomForestClassifier()
            .setNumTrees(100)
            .setMaxDepth(10)
            .setMinSamplesLeaf(5);
        model.fit(X_train, y_train);
        
        // Define business-friendly feature names
        String[] businessFeatures = {
            "customer_age", "annual_income", "credit_score", 
            "debt_to_income_ratio", "employment_years"
        };
        String targetName = "loan_approval_probability";
        
        // Export with custom names
        PMMLConverter converter = new PMMLConverter();
        String pmmlXml = converter.convertToXML(model, businessFeatures, targetName);
        
        // Validate and deploy
        boolean isValid = converter.validatePMML(pmmlXml);
        if (isValid) {
            Files.write(Paths.get("business_model.pmml"), pmmlXml.getBytes());
            System.out.println("โœ… Business model exported successfully!");
        } else {
            System.err.println("โŒ PMML validation failed!");
        }
    }
}

๐Ÿ”ง API Reference

PMMLConverter Class

The main class for PMML conversion with comprehensive functionality:

public class PMMLConverter {
    
    // Basic conversion with default feature names
    public String convertToXML(Object model)
    
    // Advanced conversion with custom feature names
    public String convertToXML(Object model, String[] featureNames, String targetName)
    
    // PMML validation
    public boolean validatePMML(String pmmlXml)
    
    // Future: Import from PMML (planned v3.1.0)
    public Object convertFromXML(String pmmlXml) // Currently throws UnsupportedOperationException
}

Method Details

convertToXML(Object model)

  • Purpose: Converts SuperML model to PMML with default feature names
  • Parameters: model - Any trained SuperML model extending BaseEstimator
  • Returns: Valid PMML 4.4 XML string
  • Throws: IllegalArgumentException for unsupported models or untrained models

convertToXML(Object model, String[] featureNames, String targetName)

  • Purpose: Converts model with business-friendly feature names
  • Parameters:
    • model - Trained SuperML model
    • featureNames - Array of custom feature names (null for default)
    • targetName - Custom target variable name
  • Returns: PMML XML with custom field names
  • Use Cases: Business reporting, cross-team collaboration, regulatory compliance

validatePMML(String pmmlXml)

  • Purpose: Validates PMML against schema and correctness requirements
  • Parameters: pmmlXml - PMML XML string to validate
  • Returns: true if fully compliant, false otherwise
  • Validation Checks: Schema compliance, required elements, data consistency

๐Ÿ“Š Model-Specific PMML Structures

Linear Regression Models

Generated PMML Structure:

<?xml version="1.0" encoding="UTF-8"?>
<PMML version="4.4">
  <Header>
    <Application name="SuperML Java Framework" version="3.1.2"/>
    <Timestamp>2025-07-20T22:30:00</Timestamp>
  </Header>
  
  <DataDictionary numberOfFields="4">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="education" optype="continuous" dataType="double"/>
    <DataField name="salary" optype="continuous" dataType="double"/>
  </DataDictionary>
  
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="age" usageType="active"/>
      <MiningField name="income" usageType="active"/>
      <MiningField name="education" usageType="active"/>
      <MiningField name="salary" usageType="target"/>
    </MiningSchema>
    
    <RegressionTable intercept="25000.50">
      <NumericPredictor name="age" coefficient="150.25"/>
      <NumericPredictor name="income" coefficient="0.45"/>
      <NumericPredictor name="education" coefficient="2500.75"/>
    </RegressionTable>
  </RegressionModel>
</PMML>

Key Features:

  • โœ… Coefficient extraction via reflection (getCoefficients())
  • โœ… Intercept inclusion (getIntercept())
  • โœ… Automatic feature name mapping
  • โœ… Continuous target type for regression

Logistic Regression Models

Generated PMML Structure:

<RegressionModel functionName="classification" normalizationMethod="logit">
  <MiningSchema>
    <MiningField name="feature_0" usageType="active"/>
    <MiningField name="feature_1" usageType="active"/>
    <MiningField name="prediction" usageType="target"/>
  </MiningSchema>
  
  <RegressionTable intercept="0.5" targetCategory="1">
    <NumericPredictor name="feature_0" coefficient="0.123"/>
    <NumericPredictor name="feature_1" coefficient="-0.045"/>
  </RegressionTable>
  
  <RegressionTable intercept="0.0" targetCategory="0"/>
</RegressionModel>

Key Features:

  • โœ… Binary and multiclass classification support
  • โœ… Logit normalization for probability interpretation
  • โœ… Class labels from getClasses() method
  • โœ… Target categories for all classes

Decision Tree Models

Generated PMML Structure:

<TreeModel functionName="classification">
  <MiningSchema>
    <MiningField name="feature_0" usageType="active"/>
    <MiningField name="feature_1" usageType="active"/>
    <MiningField name="class" usageType="target"/>
  </MiningSchema>
  
  <Node>
    <SimplePredicate field="feature_0" operator="lessOrEqual" value="35.0"/>
    <Node>
      <SimplePredicate field="feature_1" operator="lessOrEqual" value="50000.0"/>
      <Node score="0" recordCount="150"/>
    </Node>
    <Node>
      <SimplePredicate field="feature_1" operator="greaterThan" value="50000.0"/>
      <Node score="1" recordCount="75"/>
    </Node>
  </Node>
</TreeModel>

Key Features:

  • โœ… Hierarchical tree structure representation
  • โœ… Split conditions with thresholds
  • โœ… Leaf node predictions and sample counts
  • โœ… Both classification and regression support

Random Forest Models

Generated PMML Structure:

<MiningModel functionName="classification" multipleModelMethod="majorityVote">
  <MiningSchema>
    <MiningField name="feature_0" usageType="active"/>
    <MiningField name="feature_1" usageType="active"/>
    <MiningField name="class" usageType="target"/>
  </MiningSchema>
  
  <Segmentation multipleModelMethod="majorityVote">
    <Segment id="tree_0">
      <True/>
      <TreeModel functionName="classification">
        <!-- Individual tree structure -->
      </TreeModel>
    </Segment>
    <Segment id="tree_1">
      <True/>
      <TreeModel functionName="classification">
        <!-- Individual tree structure -->
      </TreeModel>
    </Segment>
    <!-- Additional trees -->
  </Segmentation>
</MiningModel>

Key Features:

  • โœ… Ensemble representation using MiningModel
  • โœ… Individual trees as segments
  • โœ… Majority vote for classification
  • โœ… Average aggregation for regression

๐Ÿš€ Cross-Platform Deployment

1. Apache Spark MLlib โšก

Deploy SuperML models in Spark environments:

import org.jpmml.sparkml.PMMLBuilder
import org.apache.spark.ml.Pipeline

// Load SuperML PMML in Spark
val pmmlModel = PMMLBuilder.load("model.pmml")

// Use in Spark MLlib pipelines
val pipeline = new Pipeline()
  .setStages(Array(pmmlModel))

val predictions = pipeline.fit(trainingDF).transform(testDF)

Supported Platforms:

  • โœ… Spark 2.4+, 3.x
  • โœ… Databricks
  • โœ… Amazon EMR
  • โœ… Google Dataproc

2. Python scikit-learn Integration ๐Ÿ

Use SuperML models in Python environments:

from jpmml_evaluator import make_evaluator
import pandas as pd

# Load SuperML PMML in Python
evaluator = make_evaluator("model.pmml")

# Make predictions
test_data = pd.DataFrame({
    'age': [25, 35, 45],
    'income': [50000, 75000, 100000],
    'education': [12, 16, 18]
})

predictions = evaluator.evaluate(test_data.to_dict('records'))
print("Predictions:", predictions)

Python Libraries:

  • โœ… jpmml-evaluator - High-performance PMML execution
  • โœ… sklearn2pmml - Integration with scikit-learn pipelines
  • โœ… pandas - Data manipulation and preprocessing

3. R Statistical Environment ๐Ÿ“Š

Deploy in R for statistical analysis:

library(pmml)
library(XML)

# Load SuperML PMML in R
model <- pmml::loadPMML("model.pmml")

# Make predictions
test_data <- data.frame(
  age = c(25, 35, 45),
  income = c(50000, 75000, 100000),
  education = c(12, 16, 18)
)

predictions <- predict(model, test_data)
print(predictions)

R Packages:

  • โœ… pmml - PMML import/export functionality
  • โœ… XML - XML parsing and manipulation
  • โœ… data.table - High-performance data processing

4. Enterprise ML Platforms ๐Ÿข

Deploy to enterprise machine learning platforms:

SAS Enterprise Miner

/* Import SuperML PMML */
proc model data=score_data;
   import pmml="model.pmml";
   score data=score_data out=predictions;
run;

IBM SPSS

* Import PMML model.
MODEL IMPORT FILE="model.pmml".

* Score new data.
COMPUTE prediction = PMML.PREDICT(age, income, education).
EXECUTE.

Microsoft Azure ML

# Deploy PMML model to Azure ML
from azure.ml import MLClient

ml_client = MLClient()
ml_client.models.create_or_update(
    name="superml-model",
    path="model.pmml",
    type="pmml"
)

Supported Enterprise Platforms:

  • โœ… SAS Enterprise Miner
  • โœ… IBM SPSS Modeler
  • โœ… Amazon SageMaker
  • โœ… Microsoft Azure ML
  • โœ… Google Cloud AI Platform

๐Ÿ”ง Advanced Usage Patterns

Model Versioning and Metadata

import org.superml.pmml.PMMLConverter;
import org.superml.pmml.PMMLMetadata;

// Add custom metadata to PMML export
PMMLMetadata metadata = new PMMLMetadata.Builder()
    .version("3.1.2")
    .author("Data Science Team")
    .description("Customer churn prediction model")
    .trainingDate(LocalDateTime.now())
    .dataSource("customer_database_v2.1")
    .performanceMetrics(Map.of(
        "accuracy", 0.934,
        "precision", 0.891,
        "recall", 0.876
    ))
    .build();

PMMLConverter converter = new PMMLConverter();
String pmmlWithMetadata = converter.convertToXML(model, featureNames, targetName, metadata);

A/B Testing Deployment

public class ABTestingDeployment {
    
    public void deployModelsForTesting() {
        // Train baseline model
        LinearRegression baselineModel = new LinearRegression();
        baselineModel.fit(X_train, y_train);
        
        // Train experimental model
        Ridge experimentalModel = new Ridge().setAlpha(0.1);
        experimentalModel.fit(X_train, y_train);
        
        PMMLConverter converter = new PMMLConverter();
        
        // Export both models with version tags
        String baselinePMML = converter.convertToXML(baselineModel, 
            features, "conversion_rate");
        String experimentalPMML = converter.convertToXML(experimentalModel, 
            features, "conversion_rate");
        
        // Save with version identifiers
        Files.write(Paths.get("baseline_v1.0.pmml"), baselinePMML.getBytes());
        Files.write(Paths.get("experimental_v1.1.pmml"), experimentalPMML.getBytes());
        
        System.out.println("โœ… A/B testing models deployed successfully!");
    }
}

Batch Model Export

public class BatchModelExport {
    
    public void exportMultipleModels(Map<String, Object> models) {
        PMMLConverter converter = new PMMLConverter();
        
        for (Map.Entry<String, Object> entry : models.entrySet()) {
            String modelName = entry.getKey();
            Object model = entry.getValue();
            
            try {
                // Export each model
                String pmmlXml = converter.convertToXML(model);
                
                // Validate before saving
                if (converter.validatePMML(pmmlXml)) {
                    String filename = modelName + "_model.pmml";
                    Files.write(Paths.get(filename), pmmlXml.getBytes());
                    System.out.println("โœ… " + modelName + " exported successfully");
                } else {
                    System.err.println("โŒ " + modelName + " validation failed");
                }
            } catch (Exception e) {
                System.err.println("โŒ Error exporting " + modelName + ": " + e.getMessage());
            }
        }
    }
}

๐Ÿงช Testing and Validation

PMML Validation Examples

import org.superml.pmml.PMMLConverter;

public class PMMLValidationExample {
    
    public static void main(String[] args) {
        PMMLConverter converter = new PMMLConverter();
        
        // Test various validation scenarios
        testValidation(converter);
        testErrorHandling(converter);
    }
    
    private static void testValidation(PMMLConverter converter) {
        System.out.println("=== PMML Validation Tests ===");
        
        // Valid model PMML
        LinearRegression validModel = new LinearRegression();
        // ... train model
        String validPMML = converter.convertToXML(validModel);
        System.out.println("โœ… Valid model PMML: " + 
            (converter.validatePMML(validPMML) ? "PASSED" : "FAILED"));
        
        // Invalid inputs
        System.out.println("โœ… Null PMML validation: " + 
            (!converter.validatePMML(null) ? "CORRECTLY REJECTED" : "FAILED"));
        
        System.out.println("โœ… Empty PMML validation: " + 
            (!converter.validatePMML("") ? "CORRECTLY REJECTED" : "FAILED"));
        
        System.out.println("โœ… Malformed XML validation: " + 
            (!converter.validatePMML("<invalid>xml") ? "CORRECTLY REJECTED" : "FAILED"));
    }
    
    private static void testErrorHandling(PMMLConverter converter) {
        System.out.println("\n=== Error Handling Tests ===");
        
        try {
            converter.convertToXML(null);
            System.out.println("โŒ Null model: Should have thrown exception");
        } catch (IllegalArgumentException e) {
            System.out.println("โœ… Null model: Correctly rejected - " + e.getMessage());
        }
        
        try {
            converter.convertToXML("Not a model");
            System.out.println("โŒ Invalid model: Should have thrown exception");
        } catch (Exception e) {
            System.out.println("โœ… Invalid model: Correctly rejected - " + e.getMessage());
        }
    }
}

Expected Output

=== PMML Validation Tests ===
โœ… Valid model PMML: PASSED
โœ… Null PMML validation: CORRECTLY REJECTED
โœ… Empty PMML validation: CORRECTLY REJECTED
โœ… Malformed XML validation: CORRECTLY REJECTED

=== Error Handling Tests ===
โœ… Null model: Correctly rejected - Model cannot be null
โœ… Invalid model: Correctly rejected - Unsupported model type

๐Ÿ“Š Performance and Monitoring

PMML Export Performance

Model Type Export Time PMML Size Validation Time
LinearRegression <1ms ~2KB <1ms
LogisticRegression <2ms ~3KB <1ms
DecisionTree 5-15ms 10-50KB 2-5ms
RandomForest (100 trees) 100-300ms 500KB-2MB 10-30ms

Memory Usage

  • Converter Instance: ~2MB baseline
  • PMML Generation: 2-5x model size in memory
  • Large Random Forests: Use streaming export (planned v3.1.0)

Production Monitoring

import org.superml.pmml.monitoring.PMMLExportMonitor;

public class ProductionPMMLExport {
    
    private final PMMLExportMonitor monitor = new PMMLExportMonitor();
    
    public void monitoredExport(Object model, String modelName) {
        long startTime = System.currentTimeMillis();
        
        try {
            PMMLConverter converter = new PMMLConverter();
            String pmmlXml = converter.convertToXML(model);
            
            // Validate
            boolean isValid = converter.validatePMML(pmmlXml);
            
            // Log metrics
            long exportTime = System.currentTimeMillis() - startTime;
            monitor.logExportMetrics(modelName, exportTime, pmmlXml.length(), isValid);
            
            if (isValid) {
                Files.write(Paths.get(modelName + ".pmml"), pmmlXml.getBytes());
                monitor.logSuccess(modelName);
            } else {
                monitor.logValidationFailure(modelName);
            }
            
        } catch (Exception e) {
            monitor.logError(modelName, e);
            throw new RuntimeException("PMML export failed for " + modelName, e);
        }
    }
}

๐Ÿ”ฎ Future Enhancements

Planned Features (v3.1.0)

  • Bidirectional Conversion: Import PMML models back to SuperML
  • Pipeline Export: Complete preprocessing + model PMML export
  • Streaming Export: Memory-efficient large model export
  • Custom Transformations: User-defined PMML extensions

Advanced Features (v3.2.0)

  • Neural Network Support: PMML export for MLPs and CNNs
  • Transformer Support: Limited PMML export for attention models
  • Ensemble Methods: Advanced ensemble PMML structures
  • Model Monitoring: Built-in drift detection in PMML

Integration Enhancements

  • Cloud Deployment: Direct deployment to cloud ML platforms
  • Container Support: Docker images with PMML runtime
  • REST API: Web service wrapper for PMML models
  • Dashboard: Model registry and deployment tracking

๐Ÿ“š Resources and Examples

Complete Examples

Community Resources


The SuperML Java PMML export functionality provides production-ready, cross-platform model deployment capabilities. With comprehensive model support, enterprise-grade validation, and seamless integration across multiple platforms, itโ€™s the complete solution for deploying SuperML models anywhere.

Ready to deploy your models? Start with our Quick Start Guide and explore the examples for your specific use case!