A Short PMML Tutorial
Is PMML Useful for Model Deployment?
At LatentView, we develop a variety of predictive analytics solutions for our clients. Our deliverables typically include a PowerPoint presentation that highlights the key findings and their consequences, Excel sheets to validate business hypotheses, and the model equations in some form for ongoing scoring. The last of these is the topic of this post.
There are a variety of ways to deliver models. For simple techniques such as logistic regression, we typically deliver the model in Excel, or in the form of a VBA-based decision simulator. For more complex techniques, we deliver a set of SAS programs that the client can use to score data on an ongoing basis. These programs contain the logic for data preparation, model scoring and validation.
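This is where PMML (Predictive Model Markup Language), an XML-based standard for describing predictive models, plays a key role. LatentView is standardizing on a PMML-based approach for delivering predictive models. PMML promotes model portability across platforms, simplifies model maintenance, and supports better lifecycle management. Because the model is described in a standard format rather than in tool-specific code, it is faster and easier to score and validate. PMML also makes it easy to build a visualization tool on top of the model.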
However, PMML does not appear to be widely used.
The drawbacks usually cited include a potential loss of accuracy, a perceived lack of support for some model types, a lack of support for complex transformations, and a shortage of third-party scoring engines that can read PMML and score models. Offerings such as Zementis have helped overcome some of these drawbacks, and I believe more are in the pipeline. There is also a need for open source scoring engines that use PMML.
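To make the idea of a scoring engine concrete, here is a minimal sketch in Python of what "reading PMML and scoring a model" involves. It is an illustration only, not how Zementis or any other product works. The PMML document, the field names (tenure, complaints, churn_score) and the coefficients are all made up for this example, and the sketch handles only a single RegressionTable with numeric predictors, with no transformations or validation.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML 4.1 document (illustrative values only).
PMML_DOC = """<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="Illustrative logistic model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="tenure" optype="continuous" dataType="double"/>
    <DataField name="complaints" optype="continuous" dataType="double"/>
    <DataField name="churn_score" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression" normalizationMethod="logit">
    <MiningSchema>
      <MiningField name="tenure"/>
      <MiningField name="complaints"/>
      <MiningField name="churn_score" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="-1.0">
      <NumericPredictor name="tenure" coefficient="-0.25"/>
      <NumericPredictor name="complaints" coefficient="0.5"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

# PMML elements live in a versioned XML namespace.
NS = {"p": "http://www.dmg.org/PMML-4_1"}


def score(pmml_text, record):
    """Score one record against a single-table PMML RegressionModel."""
    root = ET.fromstring(pmml_text)
    model = root.find("p:RegressionModel", NS)
    table = model.find("p:RegressionTable", NS)

    # Linear predictor: intercept + sum(coefficient * value ** exponent).
    y = float(table.get("intercept"))
    for pred in table.findall("p:NumericPredictor", NS):
        exponent = int(pred.get("exponent", "1"))
        y += float(pred.get("coefficient")) * record[pred.get("name")] ** exponent

    # With normalizationMethod="logit" the result is passed through
    # the logistic function; otherwise return the raw linear predictor.
    if model.get("normalizationMethod", "none") == "logit":
        return 1.0 / (1.0 + math.exp(-y))
    return y


if __name__ == "__main__":
    print(score(PMML_DOC, {"tenure": 4, "complaints": 4}))
```

The model itself is just data: the same file could be scored by any engine that understands the standard, which is the portability argument made above. A production engine would also need to handle data dictionaries, missing values, transformations and the many other model types PMML defines.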
Today we use R to generate our PMML files. However, we understand that there is a need for an easier way to create PMML files from SAS, SPSS or other standard packages without having to buy their most expensive licenses.
What do you think of PMML? Do you deploy models in PMML? Does it meet your model deployment needs? Why / Why not? Please post your comments here.