Академический Документы
Профессиональный Документы
Культура Документы
Mr. Glasbrenner
CDS-101
Writing Assignment
Statistics
Computational science methods have impacted the modern era in ways that we may not initially
realize. Statistics is a sub-domain of data science and data science uses methods that can be drawn from
this field. There is a wide range of topics associated with statistics such as describing and displaying data,
modeling and regression, experiments and sampling, hypothesis tests and confidence intervals, and
probability. The purpose of statistics is to answer questions using data, and data scientists use their ability
Researchers in statistics try to describe what is going on and predict/decide outcomes. They use
statistical methods to determine variables in the data, find out how the data is collected, as well as how
much data is collected. Data scientists usually have a strong knowledge of basic statistics and machine
learning. They are capable of visualizing and summarizing their data and analysis in a way that it is more
comprehendible for those less acquainted in data. Two primary computational data science methods that
is utilized with statistics is data sampling and regression (The Right Questions...).
The study of sampling is the most effective statistical method that will optimize the amount of
data we can gather while minimizing the level of effort. It is impossible to study an entire population so
people will rely on this method for research. Sampling is the statistical method of selecting a suitable
sample and obtaining representative data or observations. Having such information is the core and
foundation of any research because they represent real-world data as long as the group selected
adequately represents the population in a nonbiased systematic manner (which can be accomplished
Regression directly correlates as a fundamental method of data science. Regression tells you a
formula for how an outcome varies based on other information. In other words, it focuses on the
relationship among variables, more specifically between a dependent and independent or explanatory
variables. It does not tell you if some things cause others, but only how to calculate them as accurately as
possible. Is it the most commonly used method when it comes to predictions. Regression tries to predict a
numerical value of some variable for a certain thing or individual. An example of a real-world problem of
regression would be, What will be the price of a particular house?. The variable that needs to be
predicted is the price of the house and by looking at the previous prices of other, similar houses in the
population, a model can be formed which is helpful with interpreting data. Regression relates variables to
each other and can solve dissimilar and complicated relationships between them (KDnuggets.).
Computational and data science has a significant role within statistics. It uses statistics as one of
the most important tools to its foundation and progression. The fundamentals of mathematical statistics
that is taught today adds significant value and identity to everyone as mathematicians in data science.
Data science takes statistics into a greater focus on real world data analysis and computing. They
work together to break down the analyze the details of given information and relationships.
References
"The Identity of Statistics in Data Science." Amstat News - Monthly Membership Magazine of the
"KDnuggets." KDnuggets Analytics Big Data Data Mining and Data Science. N.p., n.d. Web. 16
Nov. 2016.
"The Right Questions about Statistics." The Right Questions about Statistics | Maths Learning
"Sampling - Yale University." Sampling - Yale University | Better Evaluation. N.p., n.d. Web. 16
Nov. 2016.