Actionable Analytics Practical Analytics for Practical People

20Apr/090

Airline Gate Optimization

Anyone who regularly flies long distances with one or more transits has, at some point, experienced the “moment of anxiety” of possibly missing a connecting flight because the current flight is arriving late. Most connections are scheduled less than a couple of hours apart, and the tightest can be as short as about 45 minutes (source: Frankfurt Airport). The stress becomes especially hard to bear when the transit airport is a large hub such as Frankfurt, Heathrow or Chicago O'Hare (ORD).

Ever wondered why? Airports such as these have multiple terminals, and the distance passengers must cover to reach a connecting flight in a different terminal only adds to the stress. To top it all, security measures at airports have increased so much that even a slight delay in the arriving flight can make it nearly impossible to catch the connection.

Does that mean we experience instant nirvana if we somehow get past all these barriers of distance and security and make the connecting flight? The answer is still ‘NO’, because another thought immediately creeps in: ‘Oh, but what about my baggage?’ That leaves the passenger worrying about the status of the baggage for the rest of the journey.

So is this only the passengers' problem, and need the airlines not bother about it at all? Well, not exactly! The airlines in fact take the biggest hit from sub-optimal or poorly optimized gate allocations. The costs of rebooking a passenger who missed a connection on an alternate airline, providing accommodation when no alternate connection is available, and handling lost or misplaced baggage rise steeply with the number of passengers who miss their connecting flights.

Finally, is there a way to address this problem for both the passengers and the airline in one go? Yes, and this is where Operations Research (OR) optimization comes in handy. An OR model takes into account all the input variables that affect the outcome. Here the objective is to minimize the distance travelled by passengers bound for connecting flights, which in turn minimizes the number of passengers who miss a connection and so reduces the airline's operating costs.

Linear or non-linear optimization techniques can be used to compute the best gate to assign to each incoming flight so that the total number of people who miss connecting flights is kept to a minimum. The number of passengers bound for connecting flights, the time and distance required to move baggage from the incoming flight to the connecting flight, the number of personnel required, and the costs of providing a passenger with accommodation or an alternate connection then become the inputs and constraints of this gate assignment problem.

The gate optimization problem must also be treated as dynamic, because the delays of arriving flights are not known beforehand (at least in most cases), so gate assignments may need to be recomputed as delay information arrives.
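To make the idea concrete, here is a minimal sketch of a gate assignment in Python. Everything in it is an assumption made for illustration: the flight numbers, gate names, walking distances and passenger counts are invented, and the objective (total passenger-metres walked) is only one of the criteria mentioned above. It simply tries every possible assignment, which is workable only for a handful of flights; a real airport would need a proper linear or integer programming solver.

```python
from itertools import permutations

# Hypothetical walking distances (metres) from each candidate arrival gate
# to each departure gate used by onward flights. Invented numbers.
WALK = {
    "A1": {"B5": 900, "C2": 300},
    "A2": {"B5": 200, "C2": 700},
    "A3": {"B5": 500, "C2": 500},
}

# Hypothetical connecting passengers:
# incoming flight -> departure gate of onward flight -> passenger count.
CONNECTING = {
    "LH100": {"B5": 80, "C2": 10},
    "BA200": {"B5": 5, "C2": 60},
    "UA300": {"B5": 20, "C2": 20},
}


def passenger_metres(flight, gate):
    """Total distance walked by a flight's connecting passengers from `gate`."""
    return sum(n * WALK[gate][dep] for dep, n in CONNECTING[flight].items())


def best_assignment(flights, gates):
    """Try every one-flight-per-gate assignment and keep the cheapest one."""
    best_cost, best_plan = None, None
    for perm in permutations(gates, len(flights)):
        cost = sum(passenger_metres(f, g) for f, g in zip(flights, perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_plan = cost, dict(zip(flights, perm))
    return best_plan, best_cost


plan, cost = best_assignment(list(CONNECTING), list(WALK))
print(plan, cost)
```

When a delay is reported, the same function could be re-run on the flights still to arrive and the gates still free, which is one simple way to handle the dynamic aspect.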

8Apr/090

LatentView SAS to PMML Converter

LatentView Analytics has launched SAS-2-pmml (TM), a free, desktop-based application to generate PMML representations for your SAS models.

PMML is an XML-based language developed by the Data Mining Group, which provides a standard way to represent data mining models so that these can be shared between different statistical applications.

Representing your SAS models in PMML form gives you the flexibility to deploy your models using a whole range of applications that can "consume" PMML files, thus eliminating the need for expensive SAS licenses in the production environment.

The application takes SAS model datasets (EST-type datasets containing the parameter estimates) as input, and generates a PMML file (as per the PMML v3.2 specification) corresponding to each model.

You can use SAS-2-pmml to generate PMML files for regression models built using the REG and LOGISTIC procedures.
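For readers who have not seen PMML before, the sketch below shows roughly what a PMML file for a linear regression contains. It is not the output of SAS-2-pmml and it does not read SAS EST datasets; the function name, field names and coefficients are hypothetical, and the XML has not been validated against the PMML 3.2 schema.

```python
import xml.etree.ElementTree as ET


def regression_to_pmml(target, intercept, coefficients):
    """Build a minimal PMML-style document for y = intercept + sum(coef * x).

    `coefficients` maps predictor names to parameter estimates.
    Illustrative only: all field names used here are assumptions.
    """
    pmml = ET.Element(
        "PMML", {"version": "3.2", "xmlns": "http://www.dmg.org/PMML-3_2"}
    )
    ET.SubElement(pmml, "Header", {"description": "Illustrative regression"})

    # The data dictionary lists every field the model refers to.
    fields = [target, *coefficients]
    dd = ET.SubElement(pmml, "DataDictionary", {"numberOfFields": str(len(fields))})
    for name in fields:
        ET.SubElement(
            dd, "DataField",
            {"name": name, "optype": "continuous", "dataType": "double"},
        )

    model = ET.SubElement(
        pmml, "RegressionModel",
        {"modelName": "example", "functionName": "regression"},
    )
    schema = ET.SubElement(model, "MiningSchema")
    ET.SubElement(schema, "MiningField", {"name": target, "usageType": "predicted"})
    for name in coefficients:
        ET.SubElement(schema, "MiningField", {"name": name})

    # The regression table carries the parameter estimates themselves.
    table = ET.SubElement(model, "RegressionTable", {"intercept": str(intercept)})
    for name, coef in coefficients.items():
        ET.SubElement(
            table, "NumericPredictor", {"name": name, "coefficient": str(coef)}
        )
    return ET.tostring(pmml, encoding="unicode")


xml_text = regression_to_pmml("sales", 1.5, {"price": -0.8, "ad_spend": 0.3})
print(xml_text)
```

Any application that can consume PMML would read the intercept and the predictor coefficients from the RegressionTable element to score new records.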

To check out more about SAS-2-pmml and download a copy of the application, please visit http://www.latentview.com/sas-2-pmml.html.

2Apr/090

Multi-Channel Analytics

Today I received an SMS from a popular flight carrier stating that base fares had been slashed by 50%. I called customer care to understand the terms and conditions of the offer and finally booked my holiday tickets online. What struck me was that I had used three channels of communication to make the purchase. Even more interesting, I made the final purchase online, while the offer came to me through an SMS campaign. How is the carrier going to attribute my ticket purchase to a successful conversion of the SMS campaign?

With the advent of new channels of communication such as the internet and mobile, in addition to traditional channels, customers can now receive campaign information through one channel and make the purchase on another.

Increasingly, companies are finding it difficult to attribute successful transactions to the right campaign. It is also sometimes difficult to identify the true value of a customer.

Why is it important?

A noted website conducted a study of its direct revenues and pull-through revenues. To its surprise, it found that its pull-through revenue exceeded its direct revenue by a factor of three. Let us look at a simple case of two campaigns.

[Table comparing Campaign 1 and Campaign 2: contents not available]

If the online ad were evaluated only on online purchases, we would conclude that Campaign 1 was twice as effective as Campaign 2. In reality, when both direct and pull-through revenues are measured, Campaign 2 generated twice the revenue of Campaign 1.

The tables have turned! Wrong attribution of campaign revenues would mean sub-optimal allocation of marketing budgets leading to potentially inferior campaigns!

So what is the answer?

Multi-channel analysis is the science of using data across different channels to identify common elements and study the true impact of channels and campaigns on the resulting revenues.

Why is it so hard?

The answer is the “common” element. In the earlier example, the flight carrier has no way of knowing whether I heard about the fare reduction through the SMS campaign. So is there no way? Well, there is: had the SMS campaign given me a coupon code, and had I used it to make the online purchase, the pull-through revenue could have been assigned to the SMS campaign (the “common” element here being the coupon code).
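A rough sketch of how a coupon code can act as the common element is shown here. The coupon codes, campaign names and purchase records are all made up for illustration; in practice they would come from the campaign system and the booking system respectively.

```python
from collections import defaultdict

# Hypothetical mapping from coupon code to the campaign that issued it.
COUPON_TO_CAMPAIGN = {"SMS50": "sms_fare_sale", "WEB10": "online_banner"}

# Hypothetical purchase records collected from every sales channel.
PURCHASES = [
    {"channel": "online", "amount": 120.0, "coupon": "SMS50"},
    {"channel": "online", "amount": 80.0, "coupon": "WEB10"},
    {"channel": "phone", "amount": 200.0, "coupon": "SMS50"},
    {"channel": "online", "amount": 60.0, "coupon": None},
]


def attribute_revenue(purchases, coupon_to_campaign):
    """Sum revenue per campaign, using the coupon code as the linking key."""
    revenue = defaultdict(float)
    for purchase in purchases:
        # Purchases without a known coupon cannot be linked to a campaign.
        campaign = coupon_to_campaign.get(purchase["coupon"], "unattributed")
        revenue[campaign] += purchase["amount"]
    return dict(revenue)


print(attribute_revenue(PURCHASES, COUPON_TO_CAMPAIGN))
```

In this made-up data the SMS campaign is credited with both an online and a phone purchase, which is exactly the pull-through revenue that a channel-by-channel view would miss.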

Data is the key

A key starting point in any multi-channel analysis is to identify all the data sources from which information needs to be integrated. It is also important to capture customer demographic data, in addition to channel data.

Once the data sources are identified, the most important step is to link the data through common elements. These could be as simple as unique coupon codes (as illustrated earlier), service tags or reference numbers. Increasingly, organizations use multiple 1-800 numbers so that, when several campaigns run through different channels, call volume and the resulting sales can be attributed to a specific campaign in a channel.

In the absence of such linkages, creative methodologies are needed to infer them; these work only at a macro level but can still give good insights.

Conclusion

Organizations that interact with their customers through multiple channels will increasingly need to leverage multi-channel analysis to get a true picture of the customer and gain a competitive advantage in this economic climate.

30Mar/092

Who is the best – Sachin, Lara, Mark Waugh or Jayasuriya?

A favorite topic among journalists, newspaper columnists and critics is “Who is the best batsman?” Is it Sachin, Lara, Mark Waugh, Jayasuriya or someone else? Two questions come to my mind:

  • Firstly, can we compute the performances of players?
  • Secondly, can we compare the computed performances for different players?

These two questions are pertinent because no single metric can truly compare different batsmen: they have different strike rates and aggregates, belong to different teams, and play in different circumstances and ground conditions. They have also contributed to their respective teams differently: changing the momentum by scoring heavily, steadying the team’s position by staying longer at the crease, contributing with fielding or bowling in addition to batting, and so on.

In short the questions are

  • Can we compare the performance of different individuals who take in multiple inputs in different proportions and produce multiple outputs in different proportions?
  • From the performance calculations, can we find individuals in whom improvements can be made? And if yes, to what extent can they be implemented?

In the absence of any clearly defined engineering formula relating inputs to outputs, the problem essentially becomes an empirical issue. A nonparametric approach is provided by the method of Data Envelopment Analysis (DEA).

To answer these questions, let us first take a single-input, single-output scenario, as shown below.

[Image: cricketer stats]

The performance of the players is plotted below.

[Image: performance plot]

The table below explains two approaches to comparing performance, the central tendency approach and the extreme point approach, with the help of the above figure.

[Image: approaches]

Data Envelopment Analysis (DEA) is a relatively new “data oriented” approach for evaluating the performance of a set of peer units called Decision Making Units (DMUs), which convert multiple inputs into multiple outputs. Unlike the central tendency approach, DEA is an extreme point method and compares each DMU with only the "best" DMUs (in the above example, each player’s performance is compared with the performance of Flintoff).

A fundamental assumption behind an extreme point method is that if a given DMU, A, is capable of producing Y(A) units of output with X(A) units of input, then other DMUs should also be able to do the same if they were to operate efficiently. The heart of the analysis lies in finding the "best" DMU for each DMU. If the "best" DMU is better than the original DMU, either by making more output with the same input or by making the same output with less input, then the original DMU is inefficient. The procedure of finding the best DMU can be formulated as a linear program. Comparing the actual output produced with the benchmark quantity (the quantity produced by the best DMU) yields a measure of technical efficiency. This is different from economic efficiency, in which one compares the profit resulting from the actual input–output bundle with the maximum profit possible.

In the above example, the other players are inefficient compared with the best player, Flintoff. We can measure the efficiency of the others relative to Flintoff by:

[Image: relative performance]
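In the single-input, single-output case, one common way to express this comparison is to divide each player's runs-per-ball ratio by the best ratio in the group. The sketch below assumes that definition. The figures are invented for illustration and are not the statistics from the original table; "Player C" is a placeholder.

```python
# Invented figures for illustration: (balls faced, runs scored).
PLAYERS = {
    "Flintoff": (80, 120),
    "Ponting": (100, 75),
    "Player C": (40, 45),
}


def relative_efficiency(players):
    """Efficiency of each player relative to the best output/input ratio."""
    ratios = {name: runs / balls for name, (balls, runs) in players.items()}
    best = max(ratios.values())
    # The best player scores 1.0; everyone else scores below 1.0.
    return {name: ratio / best for name, ratio in ratios.items()}


for name, score in relative_efficiency(PLAYERS).items():
    print(f"{name}: {score:.2f}")
```

With several inputs and outputs a single ratio is no longer enough, which is where the linear program mentioned earlier comes in.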

The relative performance of the players in the example discussed above is shown below.

[Image: player analysis]

The figure below shows how to make the inefficient players efficient, i.e. how to move them up to the efficient frontier. For example, Ponting (as shown in the figure below) can improve in several ways. One is to reduce the input (number of balls) to reach P1 on the efficient frontier. Another is to raise the output (runs) to reach P2. Any point on the line segment P1-P2 offers a way to become efficient without increasing the input or decreasing the output.

[Image: Ponting's performance analysis]
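The two target points can be sketched numerically if we assume the efficient frontier is a straight line through the origin and the best player (a constant-returns-to-scale assumption that the original figure may or may not share). The numbers below are invented, and `best_ratio` stands for the best player's runs per ball.

```python
def frontier_targets(balls, runs, best_ratio):
    """Return the points P1 and P2 for a player at (balls, runs).

    P1 keeps the runs fixed and cuts the balls needed to score them.
    P2 keeps the balls fixed and raises the runs scored from them.
    Assumes a straight-line frontier: runs = best_ratio * balls.
    """
    p1 = (runs / best_ratio, runs)
    p2 = (balls, balls * best_ratio)
    return p1, p2


def between(p1, p2, t):
    """A point on the segment P1-P2 (t=0 gives P1, t=1 gives P2)."""
    return (p1[0] + t * (p2[0] - p1[0]), p1[1] + t * (p2[1] - p1[1]))


# Invented example: 75 runs off 100 balls, best ratio 1.5 runs per ball.
p1, p2 = frontier_targets(100, 75, 1.5)
print(p1, p2, between(p1, p2, 0.5))
```

Every point returned by `between` uses no more balls and scores no fewer runs than the player's current position, matching the condition described above.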

Thus, using DEA, we can compare the performance of Sachin, Lara, Mark Waugh, Jayasuriya and Saeed Anwar over different inputs, such as number of balls played, number of matches played and average ranking of the opposition, against outputs such as strike rate, average, number of not outs, number of victories and number of Man of the Match awards.

Strengths of DEA*

  • DEA can handle multiple input and multiple output models.
  • It doesn't require an assumption of a functional form relating inputs to outputs.
  • DMUs are directly compared against a peer or combination of peers.
  • Inputs and outputs can have very different units.

Limitations of DEA*

  • Since DEA is an extreme point technique, noise (even symmetrical noise with zero mean) such as measurement error can cause significant problems.
  • DEA is good at estimating "relative" efficiency of a DMU but it converges very slowly to "absolute" efficiency. In other words, it can tell you how well you are doing compared to your peers but not compared to a "theoretical maximum."
  • Since DEA is a nonparametric technique, statistical hypothesis tests are difficult and are the focus of ongoing research.
  • Since a standard formulation of DEA creates a separate linear program for each DMU, large problems can be computationally intensive.

Though there are a few limitations in applying the DEA technique, it seems to be one of the most useful and powerful techniques with which we can compare the performance of politicians, film actors, universities, business firms, countries, regions and so on, wherever every individual unit converts multiple inputs into multiple outputs.

(* Sourced from http://www.emp.pdx.edu/dea/homedea.html)

10Mar/092

Adding a “Human” touch to Online Shopping

Ever faced a situation where you are just unable to locate the specific item you are looking for on a website, be it a rate plan, an accessory or some other specific product? This is similar to losing yourself in the aisles of a supermarket looking for the specific brand or product you want, hidden behind a maze of products that you have no interest in.

In a supermarket, a shop-floor attendant would drop by and politely ask “May I help you?” Online shopping can be much more impersonal: you can’t get advice on whether a product meets your needs, as you could from a shopping assistant.

“Proactive” chat solutions address this issue to a degree by using a business-rules based engine, with pre-defined business rules, to decide when to proactively display a “May I help you” pop-up.  A survey by InstantService showed that…. Web chatters spend 35% more per order… 20% of web chats resulted in a completed purchase.

Though this addresses the need for online assistance, it comes with a new set of issues to tackle: some people find it extremely intrusive, and there have been privacy concerns from shoppers who feel their movements on the web are being monitored. This can potentially drive shoppers away!

This is where predictive analytics makes a difference. Based on how different customers respond to the proactive chat pop-up, predictive models continuously learn and adapt to determine whether a user needs “help” online and what the right time to show the pop-up is.
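As a rough illustration of a model that "learns and adapts", here is a minimal sketch of an online logistic regression that is updated after every pop-up response. It is only one possible technique, not a description of any particular product; the class name, the feature (for example, minutes spent on a page) and the learning rate are all assumptions.

```python
import math


class ChatOfferModel:
    """Tiny online logistic regression: P(visitor accepts the chat offer)."""

    def __init__(self, n_features, learning_rate=0.1):
        # One weight per feature, plus a bias term stored last.
        self.weights = [0.0] * (n_features + 1)
        self.learning_rate = learning_rate

    def predict(self, features):
        """Estimated probability that this visitor would accept the offer."""
        z = self.weights[-1] + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, accepted):
        """Adjust the weights after observing one visitor's response."""
        error = (1.0 if accepted else 0.0) - self.predict(features)
        for i, x in enumerate(features):
            self.weights[i] += self.learning_rate * error * x
        self.weights[-1] += self.learning_rate * error


# Hypothetical stream: visitors who lingered (feature 1.0) accepted the
# chat offer, visitors who had just arrived (feature 0.0) dismissed it.
model = ChatOfferModel(n_features=1)
for _ in range(200):
    model.update([1.0], accepted=True)
    model.update([0.0], accepted=False)

print(model.predict([1.0]), model.predict([0.0]))
```

A site could then show the pop-up only when the predicted probability crosses a chosen threshold, which is one way to avoid interrupting shoppers who are unlikely to want help.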

By identifying the right customers to offer the proactive chat service to, we can dramatically improve revenue per visit and reduce the associated drop-out rates, which are core issues for any e-commerce website.

In case you are a user of proactive chat or provide proactive chat solutions, we would be glad to hear / discuss your viewpoints.

2Feb/090

A Short PMML Tutorial

 

Was this useful? Was this adequate? Please post your comments

31Jan/091

Is PMML Useful for Model Deployment?

At LatentView, we develop a variety of predictive analytics solutions for our clients. Our deliverables typically include a PowerPoint presentation that highlights key findings and their consequences, Excel sheets to validate business hypotheses, and model equations in some form for ongoing scoring. The last one is the topic of this post.

There are a variety of approaches to delivering models. For simple techniques such as Logistic Regression, we typically deliver the model in Excel, or in the form of a VBA-based decision simulator. For more complex techniques, we deliver a set of SAS programs that the client can use to score models on an ongoing basis. These SAS programs contain the logic for data preparation, model scoring and validation.

This is where PMML plays a key role. LatentView is standardizing on a PMML-based approach for delivering predictive models. PMML promotes model portability across different platforms, model maintenance, and better lifecycle management. It's faster and easier to score and validate with PMML models. PMML also makes it easy to develop a visualization tool.

However, from the looks of it, PMML is not so widely used.

Some of the drawbacks of PMML include potential loss of accuracy, alleged lack of support for a variety of models, lack of support for complex transformations and lack of availability of third party scoring engines that read PMML and score models. However, tools like Zementis have helped overcome some of these drawbacks, and I believe there are more such offerings in the pipeline. There's also a need for open source scoring engines that use PMML.

Today we use R to generate the PMML files. However, we understand that there's a need for an easier way to create PMML files from SAS, SPSS or other standard packages (rather than buying their most expensive licenses).

What do you think of PMML? Do you deploy models in PMML? Does it meet your model deployment needs? Why / Why not? Please post your comments here.

16Jan/090

Are Pharma Sales Reps going to be obsolete soon?

Given the economic downturn, the entire corporate sector is in cost cutting mode. Pharma companies certainly aren’t going to be left behind. They are digging deep into their expenses and evaluating the returns from every investment that has been made.  

While studying the effectiveness of various methods of promotion, Pharma companies are beginning to wonder whether they are spending too much money on reps for too little return. One clear indicator is the way leading Pharma companies are cutting down their sales forces: in the US, there were around 94,000 reps in 2007 compared to 92,000 in 2008.

The question marketers are asking is: ‘If samples are the most effective medium for influencing prescriptions, are sales reps reduced to being well-dressed delivery boys?’ Are their salaries justified? With more and more doctors ordering samples online, one begins to wonder why reps are required at all! Physicians say they prefer the online medium anyway, because they can order at their own convenience and don’t have to answer questions about their prescription habits every time they ask for a sample.

Given that there are few new product launches, owing to drying pipelines, do companies still need detailers who drop off samples and make hardly any connection with the doctor? Do reps really make any difference to the prescribing behavior of physicians?

31Dec/080

Welcome to LatentView’s Blog

Welcome to LatentView's blog!

Today's organizations are drowning in data. They have the tools needed to capture and record every transaction and most interactions with their customers. In some contexts, such as online commerce or telecom, every aspect of customer behavior is recorded, leading to a virtual explosion of information. Added to all this is a wide array of tools to slice, dice, drill down, drill across, chart or pivot the data in unimaginable ways.

This deluge of information leaves them with a good idea of what happened in the past ("who is buying which products?"), though not of why something happened ("what are the drivers of lifetime value?") or what the best action is in a given situation ("what offer should I make to this customer to increase sales?").

This is where predictive analytics can help. Predictive analytics is an analyst-guided discipline that helps clients understand complex behavior or forecast future outcomes by identifying and extrapolating patterns across data.

We, at LatentView, are constantly striving to combine creativity, business knowledge and hard math to help clients transform their marketing, efficiently trade off risks against available opportunities, and lengthen and deepen their customer relationships.

This blog is an attempt to share our experiences with you. We hope that you will enjoy sharing your experiences with us too!