
COSC3000 Assignment 2

OpenFlights & Google Earth


By Kristian Dawkins
41765188


Contents
Executive Summary
Introduction
OpenFlights
    The Data
        Airports
        Airlines
        Routes
    Cleaning the Data
The Application
    MATLAB
    Google Earth Toolbox
    Problems
Application Functionality
    Search Tools
    Routes by Airline
    Airport Density
    Airports by Country
Conclusion
References


Executive Summary
In short, this project allows the user to view airports and flight routes around the world in Google Earth, using a (fairly) up-to-date dataset. The dataset has been compiled by the not-for-profit organisation OpenFlights and its members, and covers the world's airports, airlines and the routes that those airlines operate. OpenFlights provides access to this dataset free of charge, and even supplies an example application showing what can be done with a subset of the data. By using this dataset in conjunction with MATLAB and Google Earth, users are able to view the routes of various airlines, the airports that are in service around the world, and the total number of routes to and from those airports. With the aid of Google Earth, users can then go one step further and view the actual airport. Because the project creates stand-alone Google Earth files, users can share the files they have created with others who do not have this project installed but do have Google Earth. This allows prospective users to examine the end result before setting the project up for themselves.

Introduction
This report outlines what the project is, who would be interested in using it, and how it was completed. The project itself is an application written in MATLAB[1] that allows users to filter a large set of data and create a series of files that are viewable in Google Earth[2]. The data in question consists of a relatively up-to-date account of the world's airports and airlines, and the routes that those airlines fly. Users can use the filtered data to view, in Google Earth, the routes that a chosen airline flies. The application provides other filters as well, which are explored in more detail throughout the report. The use of this application is largely academic: it is aimed at those who are interested in seeing world flight data rather than at anyone with a more practical interest. Unfortunately, in its current state, only the end result is readily distributable. To filter the data on their own, users would need to install a variety of software packages and follow non-trivial instructions, which is too much effort for a purely academic purpose. This, too, is discussed in more detail in the rest of this report.


OpenFlights
OpenFlights[3] is a not-for-profit organisation that supplies a wealth of airline data from all over the world. The data supplied by OpenFlights is compiled from both data made public by the respective airlines and data submitted by the OpenFlights community. The number of airlines, airports and routes made available by OpenFlights grows daily. There is also a publicly available application[4] produced by OpenFlights that uses a subset of the data available; however, registration is required for full access.

The Data
The OpenFlights data itself is well-formed and comes in three large files. The values within these files are separated by commas, which makes reading the data an easy task. These files contain information about the various airports, airlines and routes around the world, and are explored in this section. The following information is viewable on the OpenFlights website[3], but is repeated here for convenience.

Airports
At the time of last access, the data for 6,344 airports was made available by OpenFlights. For each of these airports in the OpenFlights dataset, the following information is recorded:
• Airport ID: A unique identifier set by OpenFlights.
• Name: The official name of an airport. For example, Brisbane Intl.
• City
• Country
• IATA/FAA: A three letter code used to identify airports on a global scale.
    o Or domestically, in the case of airports located within the United States of America.
• ICAO: A four letter code, again used for global identification.
• Latitude
• Longitude
• Altitude
• Timezone
• DST: Indicates which region's daylight saving time rules the airport follows.

Airlines
At the time of last access, the data for 13,930 airlines was made available by OpenFlights. For each of these airlines in the OpenFlights dataset, the following information is recorded:
• Airline ID: A unique identifier set by OpenFlights.
• Name: The official name of an airline.
• Alias: An alias for the airline.
• IATA: A two letter code used for global recognition.
• ICAO: Similar to the IATA code, but three letters long.
• Callsign: The callsign for an airline.
    o In the case of British Airways, this is SPEEDBIRD (this is discussed briefly later).
• Country: The country where the airline is registered as a corporation.
• Active: Whether the airline is still operating or is now defunct.
    o Noted to be inaccurate / only a rough indication.

Unfortunately, much of the airline data is incomplete, and a large number of the listed airlines are inactive. This is discussed further in the section on cleaning the data.

Routes
It should be noted that a route merely represents the fact that an airline travels from one airport (the source) to another airport (the destination) and is not indicative of the actual flight path flown. Such data is not made available by OpenFlights. At the time of last access, the data for 55,751 routes was made available by OpenFlights. This data grows rapidly, sometimes by as much as 1,000 to 2,000 routes per day, as can be seen at http://openflights.org/about. For each of these routes within the OpenFlights dataset, the following information is recorded:
• Airline: Either the IATA or ICAO code of an airline.
• Airline ID: The unique identifier for an airline, set by OpenFlights.
• Source Airport: The IATA or ICAO code of the source airport.
• Source Airport ID: The unique identifier for the source airport, set by OpenFlights.
• Destination Airport: The IATA or ICAO code of the destination airport.
• Destination Airport ID: The unique identifier for the destination airport, set by OpenFlights.
• Codeshare: Whether or not this airline shares this particular route with a partner.
• Stops: The number of stops on the way to the destination.
• Equipment: A list of three letter codes, representative of the types of planes that are used by this airline along this particular route.

Cleaning the Data


A lot of the data supplied by OpenFlights, airlines in particular, was either incomplete, not currently relevant (i.e. airlines were inactive), or not important for the purposes of the application. The following figure depicts the difference in the data made available by OpenFlights, and the data that was actually used in the application.

Figure 1 - A comparison between available data and data used.

The main reason for removing the IATA and ICAO codes and sticking to the identifiers set by OpenFlights was that it is computationally cheaper to compare numbers than to compare the two to four letter codes that are used in reality. This resulted in a dramatic increase in performance.

In the case of time-related data, as the application merely displays the connections between airports, such information was not deemed necessary. Similar reasoning was applied to the exclusion of the equipment and codeshare fields in the route data. Whilst it might be worth knowing that a particular route is being shared with another airline, there is currently no way to determine which airlines are codesharing with one another.

As for removing the Name and Alias fields from the airline data, it was found that not all airlines had an alias, and initially there was some difficulty in importing the name field. As every airline had a callsign, and it was always one word without spaces or special characters, a temporary solution was made using this field instead. The temporary solution was never reversed, but as a solution for importing special characters and spaces was found toward the end of the project, this could be fixed with relative ease. In its current state, however, British Airways remains a somewhat tricky airline to search for, as it is listed under its callsign, SPEEDBIRD. It should also be noted that, as Google Earth is essentially a 2D map wrapped around a sphere, the altitude field for airports was not deemed necessary.

After deciding which fields would be used by the application, incomplete or unusable data then had to be removed. This resulted in the number of airlines dropping from 13,930 to 842, largely because airlines either had no official name (placeholders, perhaps) or were inactive. As all of the inactive airlines had then been deleted, the Active field was no longer necessary and was removed. Further attempts at cleaning the data were then made; however, the airport data was without flaw and only ~4% of the routes were missing either a source or destination. With the data in order, it was then time to finish the application.
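The cleaning rules described above can be illustrated with a short sketch. It is written in Python rather than MATLAB and is not the code used in the project. The sample rows, the airline IDs 9001 and 9002, the "\N" null marker and the "Y"/"N" values of the Active field are all assumptions made for illustration; the field order follows the airline field list given earlier.

```python
import csv
import io

NULL = r"\N"  # assumed marker for a missing value

# Made-up sample rows in the airline field order listed earlier:
# ID, Name, Alias, IATA, ICAO, Callsign, Country, Active.
SAMPLE_AIRLINES = r'''1355,"British Airways",\N,"BA","BAW","SPEEDBIRD","United Kingdom","Y"
9001,"",\N,"","","","","Y"
9002,"Defunct Air",\N,"DA","DFA","DEFUNCT","Nowhere","N"
'''


def clean_airlines(text):
    """Keep (id, callsign, country) for active airlines that have a name."""
    kept = []
    # csv.reader honours quoted fields, so names containing spaces or
    # commas stay in a single column.
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        airline_id, name, _alias, _iata, _icao, callsign, country, active = row
        if name in ("", NULL) or active != "Y":
            continue  # unnamed placeholder or inactive airline
        kept.append((int(airline_id), callsign, country))
    return kept


def clean_routes(routes):
    """Drop (airline_id, source_id, destination_id) rows missing an endpoint."""
    return [r for r in routes
            if r[1] not in ("", NULL) and r[2] not in ("", NULL)]


print(clean_airlines(SAMPLE_AIRLINES))
```

Only the British Airways row survives in this sample, mirroring how unnamed and inactive airlines were removed. Once the inactive rows are gone, the Active column carries no information, which is why it is not kept in the output.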

The Application
The application itself was made in MATLAB, a software package intended for engineering, scientific and economic purposes, together with a plug-in that aids in the creation of files that can be used with Google Earth.

MATLAB
The main advantage of using MATLAB was that it has a vast library of functions, making the implementation of mathematical concepts trivial and eliminating the need to do any required reading on the subject. The prime example of this lies in MATLAB's distance function, which made calculating the distance between two co-ordinates on Earth a simple function call. Everything to do with the curvature of the Earth and the like was no longer a problem. The other advantage MATLAB had for this project was the ease of importing data. There were some issues at first: as the OpenFlights data came as a set of comma separated values[6] (CSVs), data fields with spaces or special characters would be read as a series of individual values. However, after converting these CSVs to a spreadsheet document type (Microsoft's XLS[5]), MATLAB was able to read the data as intended.
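For readers without MATLAB, the kind of calculation hidden behind such a distance call can be sketched with the haversine formula. This is a Python illustration of a standard great-circle formula on a spherical Earth, not a reproduction of MATLAB's function; the function name and the mean Earth radius of 6371 km are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical model)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A quarter of the way around the equator.
print(great_circle_km(0.0, 0.0, 0.0, 90.0))
```

A spherical model ignores the flattening of the Earth, so results can differ slightly from tools that use an ellipsoid.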

Google Earth Toolbox


The Google Earth Toolbox[7] is a plug-in for MATLAB that enables the creation of files that are used by Google Earth. The plug-in is an open-source project, completed in June 2009, and has been invaluable to this project. Whilst there is sufficient documentation[8] released by Google to create what is necessary for this application, the Google Earth Toolbox has dramatically reduced the required development time.
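To give an idea of what such files contain, the following Python sketch writes a small KML document by hand, with one placemark for an airport and one line for a route. It does not use or imitate the Google Earth Toolbox; the helper names are hypothetical and the Narita co-ordinates are approximate.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_point(name, lat, lon):
    """A pin at (lat, lon). KML lists co-ordinates as lon,lat,altitude."""
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "name").text = name
    point = ET.SubElement(pm, "Point")
    ET.SubElement(point, "coordinates").text = f"{lon},{lat},0"
    return pm


def placemark_line(name, start, end):
    """A line between two (lat, lon) pairs, e.g. a route."""
    pm = ET.Element("Placemark")
    ET.SubElement(pm, "name").text = name
    line = ET.SubElement(pm, "LineString")
    # tessellate asks Google Earth to make the line follow the globe's surface.
    ET.SubElement(line, "tessellate").text = "1"
    ET.SubElement(line, "coordinates").text = " ".join(
        f"{lon},{lat},0" for lat, lon in (start, end))
    return pm


def build_kml(placemarks):
    """Wrap placemarks in a KML document and return it as a string."""
    kml = ET.Element("kml", xmlns=KML_NS)
    doc = ET.SubElement(kml, "Document")
    doc.extend(placemarks)
    return ET.tostring(kml, encoding="unicode")


print(build_kml([placemark_point("NRT", 35.76, 140.39)]))
```

Saving the returned string to a file with a .kml extension gives a stand-alone file that can be opened in Google Earth, which is what makes the results shareable without MATLAB.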

Problems
The major problem with the current application is that it requires the user to have MATLAB installed, which, as described at the start of this section, is aimed at a niche market. This excludes many users who may be interested in using such an application but are unable to run it. It should also be noted that, even if a user has MATLAB installed, they would be required to follow a number of steps in order to prepare the application's data structures, which could be viewed as quite a hassle. These problems have resulted in the application being more of a functional prototype, something that could be re-implemented in a language such as C++ to ease distribution. User friendliness is also an issue.

Application Functionality
The following section goes through the functions provided by the application in some detail.

Search Tools
Due to how the data is formatted, it is necessary for the user to provide the identifier set by OpenFlights when generating Google Earth data based on either a specific airline or airport. To aid in this, the application comes with some search tools that list airlines and their identifiers, based either on the country in which an airline is located or on a partial match against a name the user inputs. For example, a user inputting the letters "QA" for a partial match would find both the airlines QANTAS and QATAR in the result set. In order to find the OpenFlights identifier for an airport, users are required to locate it manually in Google Earth using the results of one of the other functions in this section. This can be seen in figure 2.

Figure 2 - The airport information for Narita International, Japan. Note the OpenFlights identifier seen in square brackets. [2279]

Also seen in figure 2, in bold, is the IATA code for the airport, NRT; under this is the official name of the airport, Narita Intl. The next line depicts the OpenFlights identifier, [2279], next to the name of the country in which the airport resides, Japan. The last two lines were not intentionally included, but are artefacts created by Google Earth for other uses.
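The two airline search tools described in this section can be sketched as follows. This is a Python illustration rather than the application's MATLAB code. It assumes that a partial match means a case-insensitive substring match, and the identifiers 1 and 2 in the sample are placeholders, not real OpenFlights identifiers.

```python
# Sample (airline_id, name, country) rows. The IDs 1 and 2 are placeholders;
# 1355 and SPEEDBIRD are the British Airways values quoted in this report.
SAMPLE_AIRLINES = [
    (1, "QANTAS", "Australia"),
    (2, "QATAR", "Qatar"),
    (1355, "SPEEDBIRD", "United Kingdom"),
]


def search_by_name(airlines, fragment):
    """List (id, name) for airlines whose name contains the fragment."""
    needle = fragment.casefold()
    return [(airline_id, name)
            for airline_id, name, _country in airlines
            if needle in name.casefold()]


def search_by_country(airlines, country):
    """List (id, name) for airlines registered in the given country."""
    wanted = country.casefold()
    return [(airline_id, name)
            for airline_id, name, airline_country in airlines
            if airline_country.casefold() == wanted]


print(search_by_name(SAMPLE_AIRLINES, "qa"))
```

Because the sample stores callsigns in the name column, as the application does, searching for "british" finds nothing, while "speed" finds the British Airways entry.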

Routes by Airline
This is the most used function. Given the OpenFlights identifier of a particular airline, the application searches the dataset for routes serviced by that airline. After all of the routes have been determined, a list of the airports used by these routes is compiled. Other information, such as the distance between the two airports of each route, is then calculated, and the result is output into a single file to be used in Google Earth. An example of this function can be seen in figure 3. The airline used to generate this data is British Airways, OpenFlights ID 1355.

Figure 3 - The routes serviced by British Airways in the Caribbean.

In figure 3, the purple lines indicate a route between two airports. The colour of these routes is of no importance; it is randomly generated to allow multiple sets of results to be distinguished from one another. The pin-like icons, as seen previously in figure 2, each represent a particular airport.


Figure 4 - Information for the route between London Gatwick, United Kingdom and Norman Manley, Jamaica.

Figure 4 shows the effect of a user clicking on a particular route. As in the previous two figures, this is part of the results when the identifier for British Airways is used. Seen in bold, "LGW to KIN", are the IATA codes for the airports used by this particular route. The next line is the callsign of the airline that is using this route, SPEEDBIRD, followed by the distance of the route in kilometres.
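The first two steps of this function, selecting an airline's routes and compiling the airports they use, can be sketched as below. This is a Python illustration under assumed data shapes, not the application's code; the airport identifiers 10, 20 and 30 and the airline identifier 9999 are made up.

```python
def routes_for_airline(routes, airline_id):
    """Return the airline's (source, destination) pairs and the airports used.

    Each route is an (airline_id, source_airport_id, destination_airport_id)
    tuple, which is an assumed layout for this sketch.
    """
    selected = [(src, dst) for a_id, src, dst in routes if a_id == airline_id]
    # Every airport that appears as either end of a selected route.
    airports = sorted({airport for pair in selected for airport in pair})
    return selected, airports


# Made-up routes: two for airline 1355 and one for another airline.
SAMPLE_ROUTES = [(1355, 10, 20), (1355, 20, 30), (9999, 10, 30)]

print(routes_for_airline(SAMPLE_ROUTES, 1355))
```

The remaining steps would compute a distance for each selected pair and write the airports and routes out as a single Google Earth file.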

Airport Density
This next function takes the OpenFlights identifier of a particular airport and creates a list of all routes that go to or from the airport. Duplicates of these routes (that is, the same route serviced by different airlines) are then counted, giving the user an idea as to how popular, or dense, a particular route is. An example of the use of this function can be seen in figure 5. In this case, the airport used was John F Kennedy International; its OpenFlights ID is 3797. For the most part, airports and routes are depicted similarly to the previous function. The major difference here, however, is that as the number of airlines that service a route increases, so does the visual width of the route.


Figure 5 - The airport density for John F Kennedy International, United States.
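The counting step behind this function can be sketched as follows, again in Python and not taken from the application. The sketch assumes that routes to and from the airport are treated as the same connection, and that density means the number of distinct airlines on that connection. The width rule and all identifiers other than 3797 are hypothetical.

```python
def route_density(routes, airport_id):
    """Map each connected airport to the number of airlines serving the link.

    Each route is an (airline_id, source_airport_id, destination_airport_id)
    tuple, which is an assumed layout for this sketch.
    """
    airlines_by_airport = {}
    for airline, src, dst in routes:
        if src == airport_id and dst != airport_id:
            other = dst
        elif dst == airport_id and src != airport_id:
            other = src
        else:
            continue  # route does not touch the chosen airport
        airlines_by_airport.setdefault(other, set()).add(airline)
    return {other: len(found) for other, found in airlines_by_airport.items()}


def line_width(airline_count, base=1.0, step=1.0):
    """Hypothetical rule: the drawn width grows linearly with airline count."""
    return base + step * (airline_count - 1)


# Made-up routes around airport 3797 (John F Kennedy International).
SAMPLE_ROUTES = [(1, 3797, 100), (2, 3797, 100), (1, 100, 3797),
                 (3, 200, 3797), (4, 200, 300)]

print(route_density(SAMPLE_ROUTES, 3797))
```

Using a set per connection means an airline that flies the route in both directions is counted once.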

Airports by Country
A simple function; when given the name of a country, airports within that country will be displayed.

Figure 6 - A partial result of all airports in Australia

Conclusion
In conclusion, the application produces the expected results, but is difficult to use for anyone unfamiliar with its setup. It would thus benefit greatly from being re-implemented in another language with a user interface.

References
[1] http://en.wikipedia.org/wiki/MATLAB
[2] http://www.google.com/earth/index.html
[3] http://openflights.org/about
[4] http://openflights.org/
[5] http://en.wikipedia.org/wiki/Microsoft_Excel_file_format
[6] http://en.wikipedia.org/wiki/Comma-separated_values
[7] http://code.google.com/p/googleearthtoolbox/
[8] http://code.google.com/apis/kml/documentation/

