Sunday, February 17, 2013

What is a Scientist? Is a Data Scientist a true Scientist?

The past week I was thinking and reading about the definition of a scientist. The title “scientist” is not like “attorney” or “police officer,” which have clear legal definitions. Therefore, anyone can really apply their own definition. For example: Christian Scientists. Another example might be someone who calls themselves a homeopathic scientist. I, for one, would not consider either of these examples scientists, but there is nothing stopping the word “scientist” from being used in these ways. That being said, I want to explore the working definition(s) of “scientist” that most scientifically literate people should be able to agree on.

The online dictionary entries that I found all essentially defined a scientist as a person who is an expert in or is studying one of the natural or physical sciences. This definition makes me uncomfortable because it doesn’t require that the person be doing science on a regular basis. Here is a good discussion related to this subject: http://www.physicsforums.com/showthread.php?t=515662

Wikipedia basically breaks it down into two definitions:

Broad definition: person engaging in a systematic activity to acquire knowledge.

Narrow definition: person who uses the scientific method.

I have trouble with the broad definition simply because if a person is not using the scientific method, are they really doing science? At this point, I like the following definition:

Scientist: A person who does science on a daily basis.

The question then becomes: what does it mean to be doing science? The answer has to be: following the scientific method. Therefore, I have to conclude that Wikipedia’s narrower definition is the correct definition of a scientist.

I was thinking about the definition of a scientist because I was trying to answer the question: Is a data scientist a true scientist? My next post will be on the definition of a data scientist. For now, I’ll just say: no, a data scientist is not a real scientist. Please comment if you don't agree, but a data scientist generally does not practice full disclosure, the sharing of data and methodology, which is an important part of the scientific method. This is because a data scientist's work is usually corporate intellectual property.

7 comments:

Unknown said...

I'm no expert, but I would think that a lot of science happens without full disclosure. If transparency is a requirement for true science, wouldn't that disqualify any work leading to patents or other trade secrets? Medicine, engineering, etc.?

Charles Jenkins said...

I started to rethink my stance with the thought that maybe there is a viable substitute for full disclosure, such as government reporting requirements or internal review within an organization. You mentioned medicine first, so I googled “is pharmaceutical research real science” and ended up at this TED Talk by Ben Goldacre. My stance has hardened. While private research and development can lead to real things, the processes fall short of the scientific method and are not worthy of the term science.

Doug Jenkins said...

I think that a narrow definition might not be possible. As an undergrad with a chemistry major, I am in the laboratory often. I don't think that it could be argued that I am not "doing science". However, I have not even earned a degree in my field and I am not making any discoveries during my lab work.

Charles Jenkins said...

When we were talking about this on Saturday, I remember you saying that doing research was essential to being a scientist. Is that what you said? (I don’t trust my fallible human memory.) It was with that conversation in mind that I continued to think and read about the definition of science.

What is the purpose of your lab work? Are you making and testing new hypotheses or are you reproducing past experiments? If it’s the former and you are sharing your data and methodology then you are doing science. If you are reproducing experiments then you are doing science because that is an important part of the scientific method. Not incidentally, full disclosure of previous runs of an experiment is important to reproducing it.

My definition was: a scientist is a person doing science on a daily basis. Maybe it’s the “daily basis” that is the difference. Or maybe it is the difference between being a professional scientist or not.

Unknown said...

Okay I watched the TED video and I take your point that not fully disclosing results is wrong, being that it is dishonest and antithetical to the scientific community.
Not to be to persnickety, but one of your definitions of scientist is "person who uses the scientific method." After doing a few searches to refresh my memory, I'm not seeing anything in the "scientific method" that includes sharing or publicizing results. To use the example of pharma companies, I contend that they employ scientists who do science. What the company does with the results, such as not publishing negative ones, is another matter.

Unknown said...

*too

I used "to" instead of "too" in a sentence with the word persnickety. Irony!

Charles Jenkins said...

The third paragraph of the Wikipedia article on the Scientific Method states:

"Another basic expectation is to document, archive and share all data and methodology so they are available for careful scrutiny by other scientists, giving them the opportunity to verify results by attempting to reproduce them. This practice, called full disclosure, also allows statistical measures of the reliability of these data to be established (when data is sampled or compared to chance)."

I believe we've narrowed it down to the question:

Is full disclosure a necessary part of the Scientific Method?

I'm not comfortable answering no to that.

I concede that most people use a looser definition, and in practice I will operate under a looser definition at times; however, I will remain defiant and argue my stance when appropriate. I hope that over time a stricter definition will be generally recognized.

By the way, I stumbled on this article: Comparing the Engineering Design Process and the Scientific Method. It is related and interesting, and it also lends credence to your point, Paul, that full disclosure is not always included when the Scientific Method is described.