Article Analysis Joshua Lackey

Reading these articles, it becomes clear that the author, David Brooks, does not view statistics and data analysis as the be-all and end-all of decision making. On the contrary, he has no qualms about pointing out how large amounts of data can in fact be detrimental to the decision-making process. In the first article, "The Philosophy of Data," Mr. Brooks points out that there are a large number of things that data analysis truly excels at. These mostly consist of situations in which our personal preconceptions can get in the way of the information. The examples offered by the author all involve times when, in his words, "our intuitive view of reality is wrong" (Brooks, 2013).

In the field of Network Security, there are certainly many uses for statistical information. Many antivirus companies publish yearly trend reports based on the infection information gathered by their software. This statistical data can then be used to help harden systems against the most prevalent threats found in the wild. Part of the reason this data is particularly useful is that it is taken from a very large sample group. Most people who own computers have an antivirus program installed and, unless they have specifically opted out of reporting, any information about infections gets collected. This is one of the key areas where the field of statistics offers a major advantage in the Information Technology world. Network and usage data is created faster, and in greater quantity, than it can even be stored. This massive amount of raw data also has very little bias in its collection methods. Because of these characteristics, this data can be analyzed to give us real and valuable information. This information can help us make better choices when prioritizing our system security planning, as well as giving us an increased ability to respond to possible
threats by showing correlations between network activity and active attacks. By utilizing the high speed of computer systems to process the data, we are now able to respond as a threat happens instead of retroactively. In this sense, statistics is not only a helpful tool but a necessity. As stated by Phil Hollows in his article "Security Threat Correlation: The Next Battlefield," "Fast, real-time response using the correlation architecture that's appropriate for your environment is the only way to contain modern threats" (Hollows, 2002).

While this is all very useful, there are situations where the data can lead us astray. In Mr. Brooks's second article, "What Data Can't Do," he focuses on the limitations of data. These limitations have to do with problems that data is simply not very helpful at solving. Some questions, such as what decisions a person might make or how people will respond to our own decisions, require a level of involvement that cannot be obtained by reviewing the numbers. As was stated years ago by Dr. Harold Highland in response to a magazine article titled "Security Of Information and Data," "Some authors feel that the use of statistics adds scientific character to their work. Is it not about time to recognize that computer security is more an art than a science? Numeric data have their place in security books, but only if they are meaningful" (Highland, 1989). I like the way that Mr. Brooks broke down the issues, so I will touch on some of his points and how they relate to the field of Network Security.
The first limitation that the author discusses is how "Data struggles with the social" (Brooks, p. 3). One of the biggest threats faced by modern IT security is the threat posed by the insider. Indications that an employee may become a threat are something that the data does not predict well, but that a good manager or coworker will pick up on. I won't go into the long list of indicators here, but mostly they are emotion-based.

Another issue that Mr. Brooks brings up is that data is not aware of context. While statistics may help us identify the most active threats or recognize threat patterns created by automated attacks, it is often the case that the biggest threats are posed by real people engaged in real attacks. In order to predict these threats, it is necessary to place yourself into a scenario where you are the attacker and try to reason as they would, something that numbers simply do not do.

The author goes on to state that data can often create a bigger mess by finding correlations that are of no use to us. As more and more of these correlations appear, we can be led into believing that intrusion threats exist where there are none. Even worse, there can be so much data collected that when we actually do see evidence of intrusion into the network, it is dismissed as innocuous noise.

In conclusion, I have to agree with the author of these two articles. In the field of Network Security, as well as in Information Technology as a whole, we use data continuously for many things, but it is important to realize its limitations. In the right situations data can be one of the most useful tools we have, but when used incorrectly it can cause us to take the wrong course of action.
1. Brooks, David. "The Philosophy of Data." NYTimes. 4 Feb. 2013.
2. Brooks, David. "What Data Can't Do." NYTimes. 4 Feb. 2013.
3. Hollows, Phil. "Security Threat Correlation: The Next Battlefield." ESecurity Planet. N.p., 14 Nov. 2002. Web. 04 Dec. 2013.
4. Yu, Bin. "Embracing Statistical Challenges in the Information Technology Age." Technometrics 49.3 (2007).
5. Highland, Harold J., Dr. "Security Statistics." Computer Fraud and Security Bulletin. N.p., Nov. 1989. Web. 04 Dec. 2013.
6. Cottrell, R.L.A. "Analysis of Network Statistics." Computer Physics Communications 45.2 (1987): 93-103. Print.