Data Science is Changing the Role of Subject Matter Experts in Projects
“There’s no going back, and there’s no hiding the information. So let everyone have it.” Andrew Kantor
“Sunlight is the best disinfectant.” William O. Douglas
There is a revolution in data analytics that is fundamentally changing how we think about the role of the Subject Matter Expert (SME) and the application of SME knowledge to intellectual-property-type projects.
While I am dramatically simplifying these concepts, there are four high-level processes you can follow to conduct analytics against a data set in support of a project. They are applicable regardless of what you are trying to analyze.
I would be interested if any of my Data Science friends would group them differently!
1) Consult with the SMEs and use their experience to identify the attributes that will be used to identify what you are looking for, then use these attributes to develop and automate your analysis. This works well with a predictive project management model because you can gather requirements in advance and build to them, but it relies almost entirely on the SME being correct.
2) Consult with the SMEs to identify the attributes they believe will be used to identify what you are looking for. Then apply a Bayesian-derived algorithm to prove or disprove the hypothesis, make modifications and rerun the algorithm. This iterative process is followed until the output of the process has been validated. At this point the adoption of an agile or a predictive approach for the final product depends entirely on the nature of the work.
3) Big Data statistical analysis. Take the entire population (if possible) and compare everything to everything until you have a statistically accurate mean. Then begin to look for outliers until you have built a model that accurately identifies whatever you are looking for. The model is then validated by the SMEs and the appropriate project management model is chosen based on what the data is being used for.
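The outlier step in the third method can be sketched in a few lines of Python. This is only a minimal illustration, assuming a single numeric attribute and a simple z-score test. The function name find_outliers, the threshold of 2.0 and the sample amounts are all made up for the example and do not come from any particular project or tool.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold.

    The z-score is the distance from the mean measured in sample
    standard deviations. The threshold is an assumption that an SME
    would normally help tune.
    """
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        # Every value is identical, so nothing stands out.
        return []
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical population: nine ordinary amounts and one unusual one.
amounts = [102, 98, 105, 97, 101, 99, 103, 100, 96, 480]
outliers = find_outliers(amounts)
print(outliers)
```

In this sketch the analysis flags the records, and the SME's job is to look at what was flagged and confirm whether it really is what the project is looking for.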
Note that the role of the SME evolves from being the source of the requirements to validating the output of the analysis. All three of these methods maintain traceability from the data elements determined to have the highest value to the output of the analysis. Finally, while the first method clearly points to a traditional requirements elicitation and a predictive project management model, the project approach for the other two should be decided based on what your organization is trying to accomplish.
If you require a high level of documentation and traceability, then you can view the data analysis as a form of requirements elicitation. Once you have the requirements, you can use them to develop a traditional predictive project plan and build a production system.
If you can use the output of the analysis without further development, you can view the data analysis as the project itself and use an agile methodology. There is a fourth type of data analytics.
4) Machine learning takes an entirely different approach.
A simple example used to explain machine learning goes like this: You want to teach a computer to recognize a cat. Using a traditional method, you would decompose the attributes of “cat” until you had captured them all, then you would write a software program that instructed the computer to check for those before determining if it was “looking” at a cat. The problem with this method is that there are many varieties of “cat”, and it would be almost impossible to differentiate between “cat” and “raccoon”.
The machine learning approach would be to develop an algorithm and then “show” the computer hundreds of thousands of pictures, identifying those that represented a cat until the algorithm “learns” what a cat looks like! The algorithm “teaches” itself what a cat looks like. However, exactly how it did so is not known; only the output is. This is referred to as a “black box” because you have lost traceability between the requirements that listed the attributes of “cat” and the actual identification of the cat.
From a project management perspective, this can be a problem for certain industries (think law enforcement) where there must be a reason to target an individual, and the reason can’t be “we don’t know, the machine told us to!” In instances like this, there must be traceability. Machine learning tools should be used as indicators, and the actual decision to proceed should be validated and augmented by SMEs. In this example, the SME makes the actual decision based on their expertise in analyzing the output.
The increasing use of big data, data analytics and machine learning makes it even more important that the project manager has a clear understanding of the purpose of the project, the environment it will be deployed in and the environment it is being developed in.
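To make the idea of learning from labeled examples concrete, here is a toy Python sketch. It is not how real image recognition works: it assumes each animal has already been reduced to two invented numeric features (weight in kilograms and number of tail rings), and it uses a very simple nearest-centroid classifier. The names train and predict and all of the data are hypothetical. Unlike a real image model, this tiny model can still be inspected, so it shows the "learn from examples" step rather than the black box itself.

```python
# Hypothetical labeled examples: ((weight_kg, tail_rings), label).
# The numbers are invented purely for illustration.
labeled_examples = [
    ((4.0, 0), "cat"),
    ((3.5, 0), "cat"),
    ((5.0, 0), "cat"),
    ((7.0, 6), "raccoon"),
    ((8.5, 5), "raccoon"),
    ((6.5, 7), "raccoon"),
]


def train(examples):
    """Learn one average feature vector (centroid) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {
        label: [s / counts[label] for s in acc]
        for label, acc in sums.items()
    }


def predict(model, features):
    """Return the label whose learned centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(model, key=lambda label: squared_distance(model[label]))


# Nobody wrote a rule such as "cats have no tail rings".
# The model derives its own boundary from the labeled examples.
model = train(labeled_examples)
print(predict(model, (4.2, 0)))
print(predict(model, (7.5, 6)))
```

The point of the sketch is that the person supplies labeled examples instead of a list of attributes. With hundreds of thousands of images and a far larger model, the learned numbers stop being something a person can read, which is where the traceability concern comes from.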
I have an overseas trip coming up, and I am required to take the airline with which my organization has negotiated the “best” rates. I love when we pick a new contract carrier because I suddenly go from concierge lounge access and upgrades to steerage class seats. I also noticed that a lot of the airlines “code share” for intercontinental flights, but you can’t get your seat assignment until 36 hours before the flight leaves! How is it possible that with today’s technology I can’t log on and get a seat assignment? As technology gets better, large businesses should focus on the day-to-day customer experience. Come on, this is ridiculous!