Ten years ago I was working as a product manager, and I had trouble explaining to my parents what exactly I was doing.
Now that I am a data scientist, the challenge is still there.
No, I am not creating reports based on data; that is what data analysts do.
No, I am not inventing the next artificial intelligence; that is what scientists do.
No, I am not just training machine learning models, for three reasons:
a) Most of the available data cannot be used for that, so I need to be involved in data engineering first to get usable data.
b) I can’t just take a requirement from a customer or from product management and train a model that implements it. Machine learning is not magic and can’t do everything, so I first need to work with the customer or stakeholder to come up with a requirement that is both technically feasible and valuable for their business.
c) Training LLMs is very expensive, and there are other ways to use them.
No, I am not a software developer. Software developers implement the user stories and tickets assigned to them; I decide for myself what to implement and why. Besides, I don’t care about proper software process. My goal is to discover a product feature that uses data in a way that brings in money. I usually don’t see any reason to create software documentation, write unit tests, implement CI/CD pipelines, or use branches, merge requests, and code reviews, at least not before we have our first three paying customers.
No, I am not an infrastructure guy; that is what the DevOps people are.
What would I like to be doing?
- I want to make sure there is data available, in the quality and quantity I need to start working.
- I want to create product ideas that promise some business value by using this data.
- I want to test the product ideas by implementing a prototype and trying to sell it to customers.