
INFO3063 Assignment #3

Winter 2007
Instructor: Michael Feeney

NOTE: This is a fairly “good enough for rock-n-roll” sort of a document. In other
words, imagine that it’s a verbal lecture or tutorial session, not some classic work of
literature. Its intention is to lead you in the right direction, not win a Pulitzer.
• See an error? Let me know.
• Think it should be rewritten? Go for it! Then send me the updates.
• Missing something? What is it? Please let me know.

Reading Command Line Parameters:


One of the big parts of Project #3 is trying to read and make sense of the command line
parameters.

There are several parts to this:


• Getting the parameters themselves
    o Dealing with the char arrays (“C” style strings)
    o Getting “string” parameters
    o Getting numeric parameters
• Trying to make sense of the parameters
    o Decoding (or “parsing” – sort of)
    o Checking for errors – wrong combinations
• Trying to get a variable list of parameters
    o i.e. the variable list of files at the end.

First things, first. Let’s get them parameters!

Getting the parameters:

As you know, there are two forms of the main function in C++, the one we’ve used most
often ( int main() ) and the one that passes the command line parameters:

int main(int argc, char* argv[])

One of the “features” of C/C++ arrays is that we don’t know how big they are. Yes, yes, I know
that if we use sizeof() in the same block as the array is defined, we can calculate the size,
but in this case, we didn’t declare the array, so we don’t know. argv is “declared” by the
operating system and passed to our main function. We have no idea how big it is.

Or do we? Well, we do. It’s exactly argc elements long, which is why that other parameter
is there. Gee, lucky that Brian Kernighan and Dennis Ritchie included that parameter, too
(http://cm.bell-labs.com/cm/cs/cbook/index.html)!!

Here’s a simple program that prints out any command line parameters that we pass into the
program:

#include <iostream>

using namespace std;

int main(int argc, char* argv[])
{
    for (int x = 0; x < argc; x++)
    {
        cout << argv[x] << endl;
    }
    return 0;
}

Note that cout already “knows” how to deal with the character arrays contained in argv.
Pretty slick, huh?

Well, it may be “slick” and all, but if we want to do more than just print them out, we are
going to back away quickly from these “C” style arrays.

Some background on the “C string” (aka char arrays):

Long ago, in ancient times, when C was king, the world was a different place. Even character
sets were a dubious business.

Heard of ASCII? It’s a code to define a number for each character of the alphabet. “A” is
65, “B” is 66, etc. Today, we know ASCII as the character codes for all English letters, and
know that ASCII is a 1 byte standard. Really? Are you sure? Well, ASCII is actually a 7-bit
standard, not 8. So ASCII isn’t really 1 byte.

What? That’s madness, you say! Yes, but back then, even the size of a “byte” was being
debated.

Like I said, it was a different world.

This leads to a point, which is that “strings” were computer specific. So, the C language
didn’t have any strings as a built in type. The idea was that the specific computer would
have to deal with strings in their own way.

The solution, at the time, was to declare a “string” as an array of characters - which is really
what a “string” is anyway – that ends with a “zero.” Now, I don’t mean the ASCII zero
character, but an actual zero. In C, this was the escape sequence ‘\0’.

So, knowing that, which one of these “strings” is really a “string” in C?

“Hey there, you with the stars in your eyes”

“Hey there, you with the stars in your eyes\0”

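If you said the second one, give yourself a gold star! (That’s what the array has to look
like in memory. When you type a string in double quotes in your code, the compiler tacks
that ‘\0’ on the end for you, so you don’t type it yourself.)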

The C language also included various libraries (like C++ has with iostream, etc.), that
allowed programmers to manipulate these “strings” in various ways.

For example, strlen() returned how many characters were in a “string.” It does this by
scanning through an array of characters, looking for that ‘\0’ character that’s at the end.
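
To make the “scanning” idea concrete, here is a minimal sketch of how a strlen()-style function could work. The name MyStrLen is made up for this example (it is not a library function); in real code you would just call strlen().

```cpp
#include <cstddef>

// Hypothetical helper (NOT a library function): counts the characters
// up to, but not including, the terminating '\0'.
std::size_t MyStrLen(const char* text)
{
    std::size_t count = 0;
    while (text[count] != '\0')   // keep going until we hit the terminator
    {
        count++;
    }
    return count;
}
```

Notice that the function has no idea how big the array really is. It simply trusts that a ‘\0’ is in there somewhere, and it stops at the first one it finds.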

So, why change, you ask?

Well, imagine some of the “fun, fun, fun” that could be had based on the following:
• Arrays (a char array is an array, eh) can’t change size
    o If your array is too small, too bad
    o If your array is too big, space is wasted
• What if somebody forgot to place that ‘\0’ at the end?
• What if there is more than one ‘\0’ in the character array?

Now, it’s all fun to work with character arrays, and all, and a great deal of code uses
character arrays, but, unless you really get off on stress and aggravation, it’s time to move
on.

For example, what if we want to compare some strings, like say, we want to know if
someone has typed something into the command line or not.

How about this?

int main(int argc, char* argv[])
{
    if ( argv[1] == "hello")
    {
        cout << "entered hello" << endl;
    }
    // ... and so on ...

Well, guess what, kids? It compiles! It runs! It’s amazing!

But, it doesn’t do what you want.

What it is really doing is comparing the two arrays.

What “two arrays?” What are you talking about?

Well, there are two arrays, as far as the compiler is concerned. The first is argv[1] and
the second is… wait for it…. “hello”.

And, one “special” thing about arrays is that when you use the name of an array in an
expression like this, what you actually get is a “pointer” to its first element (element
zero) - in other words, it’s just a memory location.

So when we do a comparison like the code above, we are checking whether the memory
location for argv[1] is the same as the memory location for “hello”. Well, they aren’t,
as they are two separate arrays, so it will always return false.

Gee, that’s so helpful.

In the C language, we could compare them with the function strcmp():

int main(int argc, char* argv[])
{
    cout << argv[1] << endl;
    if ( strcmp(argv[1], "hello"))
    {
        cout << "entered hello" << endl;
    }

Well, that’s not too bad, eh?

Well, there are two main problems, the first being that this doesn’t work.

Oh, it will compile, and run, but it won’t do what you expect.

strcmp() will return 0 if the strings are the same. The if statement treats 0 as false (and
anything else as true), so combining strcmp() with if means that the test comes out “true”
if the strings are not the same. Well, that makes sense.

So to get our code to work using a structure like that, we’d have to do something like this
(note the not - ! – in front of the strcmp() function):

int main(int argc, char* argv[])
{
    cout << argv[1] << endl;
    if ( ! strcmp(argv[1], "hello"))
    {
        cout << "entered hello" << endl;
    }

Remember, it was a different world. We weren’t all nuts or anything. It’s just that you didn’t
write code with a structure like that back then.
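
To see the two kinds of comparison side by side, here is a small sketch. The function names SameLocation and SameText are invented for this example; only strcmp() is a real library function.

```cpp
#include <cstring>

// Hypothetical helper: compares the memory locations of the two arrays,
// which is what == does when both sides are char pointers.
bool SameLocation(const char* a, const char* b)
{
    return a == b;
}

// Hypothetical helper: compares the characters themselves, up to the '\0'.
// strcmp() returns 0 when the two strings match.
bool SameText(const char* a, const char* b)
{
    return std::strcmp(a, b) == 0;
}
```

If you made two separate char arrays that both held “hello”, SameLocation would say false (two different places in memory) while SameText would say true (same characters). Writing the test as == 0 also reads a little better than the ! version.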

The point (it’s about time) is that, while there are all sorts of functions for dealing with “C”
strings in C, we don’t want to do that. What we want is to get those suckers converted into
strings as soon as possible, then manipulate them like the nice strings we know now.

Now, knowing how much C stuff is out there, and knowing that C++ would have to deal with
all sorts of legacy C code, the string class was written to be “nice” to C strings.

For example, have a look at this code:

int main(int argc, char* argv[])
{
    string x = argv[1];
    if (x == "hello")
    {
        cout << "Hello was entered" << endl;
    }

Isn’t that nice? Notice that the string can be directly assigned the character array.

You could also use the .append() function:

int main(int argc, char* argv[])
{
    string x;
    x.append(argv[1]);
    if (x == "hello")
    {
        cout << "Hello was entered" << endl;
    }

Once the data is inside a string, all is well with the world.
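
Since we want everything converted as soon as possible, one option is to convert all the parameters in one go. This is only a sketch, and the function name ArgsToStrings is made up for this example:

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies every command line parameter into a string.
// The loop stops at argc, so it never touches a parameter that isn't there.
std::vector<std::string> ArgsToStrings(int argc, char* argv[])
{
    std::vector<std::string> args;
    for (int x = 0; x < argc; x++)
    {
        args.push_back(argv[x]);   // char array is converted to a string here
    }
    return args;
}
```

You would call this at the top of main, passing along argc and argv. After that, args.size() tells you how many parameters there are, and you can check it before looking at args[1]. That matters, because the examples above read argv[1] without checking that the user actually typed anything.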

Aside: Command lines in Visual Studio 2005:

The command line parameters that your program sees come from whatever the user (or some
other program) enters after the name of the program.

We can simulate this in a number of ways. For example, here is the properties window of a
program that enters some command line parameters (create a shortcut to a program – exe
– then get the properties of the shortcut to try it yourself):

Notice that the command parameters “hey” and “there” have been added. When the
program runs, the command line parameters will be:

argv[0] = “CommandLine01.exe”
(NOTE: in windows, this would be the full path)
argv[1] = “hey”
argv[2] = “there”

The other option would be to type them into the command window. To do this, open a
command prompt, navigate to the location of the executable, then just type them in.

The problem is that we won’t be able to easily debug doing either one of these.

Visual Studio allows us to enter command lines into our program and then do all the other
things we would normally do.

To get at this, get the properties of your project in the Solution Explorer window and locate
the Debugging tab. Enter whatever you’d like into the text box and those items will be
placed on the command line when the program is run (even in debug mode).

Getting numeric parameters:

If everything is a string, all is well with the world, but if there are some numeric
parameters in there, things are not so good.

For example, if one of our parameters is “18384.2”, we likely want this as a number, not a
string, so placing it into a string is not really a step forward.

Keep in mind, though, that it was passed into the program as a character array, and leaving
it like that isn’t so hot, either.

There are two options, the “C” way and the “C++/stream” way.

I would recommend the “C++/stream” way, but only because that’s the way things are
heading, not because it’s simpler (it actually has more steps).

The “C” way:

There are several functions that can convert character arrays to numbers. There is no one
“master” conversion function, and you have to know specifically what you are converting
into what or it won’t work.

The functions (most of which are in <stdlib.h> - note the “h” at the end) are:

• atoi() : stands for “ASCII to Integer”
• itoa() : stands for “Integer to ASCII” (not part of the C standard, but many compilers supply it)
• atof() : stands for “ASCII to floating point”
• fcvt() : converts floating point to ASCII (also non-standard)

(Another note is that there are all sorts of variants of these, depending on whether it’s an
int or a long, whether the characters are char or wchar_t (UNICODE), etc.)

Here’s how you might use them (this assumes that the command line was “132.3 283.3”):

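#include <iostream>
#include <stdlib.h>   // atof() lives here

using namespace std;

int main(int argc, char* argv[])
{
    double x = 0.0;
    double y = 0.0;

    x = atof(argv[1]);   // "132.3" becomes 132.3
    y = atof(argv[2]);   // "283.3" becomes 283.3

    cout << x + y << endl;
    return 0;
}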

There are a few other things to note here:

• If the conversion doesn’t work, it sets the value to zero. There is no way to know if
the number was zero or if it just didn’t work. I suppose you could check the original
string to see if it was “0”, but it might be that it was “0.0”, or “0.00”, or “00.0” or …
you get the idea.
• The data types are very, very picky. If the string is wchar_t, not char, the compiler
won’t like it. Also, atof() hands back a double, so if you store the result in a float,
you will likely get a compiler warning about converting the values.
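
If you do stay with the “C” way, there is a cousin of atof() called strtod(), also in <stdlib.h>, that gets around the “was it really zero?” problem: it tells you where in the string it stopped converting. Here is a minimal sketch; the function name TextToDouble is made up for this example.

```cpp
#include <cstdlib>

// Hypothetical helper: converts text to a double and reports success.
// Returns true only if the whole string was a valid number.
bool TextToDouble(const char* text, double& result)
{
    char* end = 0;
    result = std::strtod(text, &end);
    // If nothing was converted, end still points at the start of text.
    // If junk follows the number, end does not point at the '\0'.
    return (end != text) && (*end == '\0');
}
```

With this, “0.00” comes back as a successful conversion to zero, while “hello” comes back as a failure, so the two cases are no longer mixed up.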

The C++ way:

Since the amazing stream can interpret all sorts of things correctly - did you ever wonder
how cin can “understand” a string vs a double when typed in? – why don’t we harness the
power of the stream object?

Why indeed.

Well, we can. There is a special type that combines the string with the basic stream,
called… wait for it… a stringstream.

While you “could” use a stringstream like a string (you could, but it would be a pain in the
neck as you will see, and pointless, as there is a string, eh), the real power comes from its
ability to behave like a stream.

Here is some example code (again, assuming that the command line was “132.3 283.3”):

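#include <iostream>
#include <sstream>

using namespace std;

int main(int argc, char* argv[])
{
    stringstream x;
    stringstream y;

    x << argv[1];   // stream the character arrays in...
    y << argv[2];

    double x1 = 0.0;
    double y1 = 0.0;

    x >> x1;        // ...and stream them back out as doubles
    y >> y1;

    cout << x1 + y1 << endl;
    return 0;
}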

There are a few things to note:

• See how the stringstream is initialized? It isn’t with the “=” like a regular string. Since
it’s a stream, it has to use the stream operators, namely << and >>. You can’t use
the “=” to assign values to the stringstream, any more than you could with cin, cout,
or the fstream objects. Depending on your compiler, trying the “=” will either fail to
compile or won’t do what you would expect.

• The line x << argv[1] streams the second command argument into the
stringstream. Just like cin would do, the stringstream can accept pretty much
anything you throw at it, so passing a character array is nothing the stringstream
can’t handle.
• The real “power” comes a few lines later, when we stream the information back out.
In the line x >> x1 , the stringstream “knows” that we are trying to place its
contents into a double, so it performs the appropriate conversions. This may not look
like much, but remember that we could have typed almost any valid double value at
the command line and it would still “handle” it without any problems.
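• Unlike the “C” functions (atoi, atof, etc.), the stringstream can tell you if
something is wrong. If the conversion fails, the stream goes into a “fail” state, which
you can check with its .fail() function. (You can also ask the stream to raise an
“exception” when this happens, using its .exceptions() function, but it doesn’t do
that by default. We will look at exceptions later in the class.) This is a Big Deal
if the data we are getting in is not what we expect. Remember that the “C” functions
simply convert the number to a zero if the conversion fails, the problem being that
we have to do some fancy footwork to know if the number really is a zero, or
something went wrong.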
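
Here is a sketch of how you might check whether a stringstream conversion actually worked, using the stream’s .fail() function. The function name StreamToDouble is made up for this example, and rejecting leftover characters (like the “abc” in “12abc”) is my own choice, not something the project requires.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: streams text into a stringstream, then back out
// as a double. Returns false if the extraction failed or if there were
// leftover characters after the number.
bool StreamToDouble(const std::string& text, double& result)
{
    std::stringstream converter;
    converter << text;
    converter >> result;
    if (converter.fail())
    {
        return false;            // not a number at all
    }
    char leftover;
    converter >> leftover;       // try to read one more non-space character
    return converter.fail();     // success only if nothing was left to read
}
```

In main you could pass argv[1] straight in (it gets converted to a string on the way), then bail with an error message if the function returns false.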

Trying to make sense of the parameters

Now that we can get strings and numbers in from the command line, we have to try and
make sense of them.

In project #3, there are a number of combinations of commands that the user can enter.

At first, there seems to be an overwhelming number of combinations, but there isn’t.

If there were an overwhelming number of combinations (imagine writing something like a
compiler, for instance), then we’d need a full-blown parser and “tokenizer” etc.

In our case, the commands are quite structured and we can expect one of the following
conditions:

• The commands follow a set sequence or…
• The user screwed up, in which case we can quit.

For example, one entity that is blatantly missing from project #3 is types of foods. Let’s
assume that we want to work with various types of foods, defined as follows:

Entity type   Properties    Data types   Enumeration (if applicable)
-----------   -----------   ----------   ---------------------------
Food          Name          string       No “white space” in name, i.e. “Green_beans” is
                                         allowed, but “Green beans” is not allowed
              Type          string       Based on the four food groups: “Grain”, “Dairy”,
                                         “Protein”, “Vegetable”
              PortionSize   double       Always positive
              Portions      int          Always positive

This means, that we could expect some of the following commands:

myprog sort ascending Name screen myfoods.dat

myprog sort descending PortionSize file output.txt myfoods.dat

myprog search Name Beans screen myfoods.info

myprog search Portions 5 file fiveportions.txt somefood.dat morefood.dat food3.dat

Now, at first, there seems to be a whole whack of combinations. How can my program
deal with all of those?!!

The secret is to break it down step by step.

Firstly, notice that the smallest command line combination would be something like the first
example, which has five (5) parameters + the name of the program, so six (6) command
line parameters.

If the program starts up and there are fewer than six (6) items passed, we know there is a
problem, so we can bail.

But how do we know? Well, that’s what the argc value is for! Have a look at the following:

int main(int argc, char* argv[])
{
    if (argc < 6)
    {
        cout << "Hello? Read the manual." << endl;
        return -1;   // not enough parameters, so bail
    }
    // ... otherwise, carry on ...
    return 0;
}

Secondly, notice that the first argument (after the name of the program) is either “sort” or
“search”. Anything else is an error.

Knowing that, consider the following code:

int main(int argc, char* argv[])
{
    string param1 = argv[1];

    if (param1 == "sort")
    {   // do sort
        cout << "You want to sort..." << endl;
    }
    else if (param1 == "search")
    {   // do the search
        cout << "You want to search..." << endl;
    }
    else
    {   // Duh.
        cout << "Error: You don't know what you want." << endl;
    }

On a side note, why couldn’t we use the switch to check the command line choices? Think
about it.

So that deals with the first two parameters (the name of the program is always the first
parameter – argv[0]), now what?

If the user picked “search”, then the next parameter has to be one of the properties of the
entity you picked. In this example, using foods, the parameter has to be one of Name,
Type, PortionSize, or Portions. If they enter anything else, they’ve made an error.

If the user picked “sort”, then the next parameter has to be “ascending” or “descending”.
Anything else is also an error.

Also, if they picked “sort”, then the next parameter after that is one of the properties of your
entity (i.e. same code structure as if they picked “search”, eh?).

Here’s an example:

int main(int argc, char* argv[])
{
    string param1 = argv[1];
    string param2 = argv[2];
    string param3 = argv[3];

    if (param1 == "sort")
    {   // Is sort valid?
        if ((param2 == "ascending") || (param2 == "descending"))
        {   // appropriate property?
            if ((param3 == "Name") || (param3 == "Type") ||
                (param3 == "PortionSize") ||
                (param3 == "Portions"))
            {
                cout << "Command is valid" << endl;
                // TODO: Amazing code here
            }
            else
            {
                cout << "ERROR: Invalid property." << endl;
                return -1;
            }
        }
        else
        {
            cout << "ERROR: Read the manual, bub." << endl;
            return -1;
        }
    }
    else if (param1 == "search")
    {   // do the search - property OK?
        // For a search, the property comes right after "search",
        // so it is param2 (not param3, like it is for a sort)
        if ((param2 == "Name") || (param2 == "Type") ||
            (param2 == "PortionSize") || (param2 == "Portions"))
        {
            cout << "Command is valid" << endl;
            // TODO: Amazing code here
        }
        else
        {
            cout << "ERROR: Invalid property." << endl;
            return -1;
        }
    }
    else
    {   // Duh.
        cout << "Error: You don't know what you want." << endl;
        return -1;
    }
    return 0;
}

So now the first four or five parameters are dealt with.

Note that this code is a bit of a mess. There is nothing “wrong” with that, but notice how the
test for the various properties is coded twice?

That just screams out for a function, doesn’t it. That’s a rhetorical question, with the answer
being “yes,” by the way.

Assuming we have the following function:

bool ValidProperty(string propertyName)
{
    if ((propertyName == "Name") ||
        (propertyName == "Type") ||
        (propertyName == "PortionSize") ||
        (propertyName == "Portions"))
    {
        return true;
    }
    return false;
}

We can shorten this a bit:

int main(int argc, char* argv[])
{
    string param1 = argv[1];
    string param2 = argv[2];
    string param3 = argv[3];

    if (param1 == "sort")
    {   // do sort
        // Is sort valid?
        if ((param2 == "ascending") || (param2 == "descending"))
        {   // appropriate property?
            if (ValidProperty(param3))
            {
                cout << "Command is valid" << endl;
                // TODO: Amazing code here
            }
            else
            {
                cout << "ERROR: Invalid property." << endl;
                return -1;
            }
        }
        else
        {
            cout << "ERROR: Read the manual, bub." << endl;
            return -1;
        }
    }
    else if ((param1 == "search") && ValidProperty(param2))
    {   // do the search - property OK
        // (for a search, the property is param2, not param3)
        cout << "Command is valid" << endl;
    }
    else
    {   // Duh.
        cout << "Error: You don't know what you want." << endl;
        return -1;
    }
    return 0;
}

The next parameter, if the command is “search”, is the value you are looking for.

The snag here is that it could be a string or a number, depending on what type of thing we
are looking for.

Now, if we knew what we are looking for, it simplifies the conversion greatly.

WAIT A MINUTE! We do know!

How do we know?

Well we know the possible parameters (and if you have the equivalent code so far, we
know that they are valid), and we know the types of things that have to be entered.

In our case, if they enter “Name” or “Type”, the item has to be a string. What’s more, it can
only be a single word (with no white space). If they enter “PortionSize”, it has to be a
double, and if they enter “Portions”, it has to be an int.

Simple. If we only knew how to convert the command parameter to a number….

…but we do (see all the code at the top that you skipped over or forgot).
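
Putting those two ideas together, here is a rough sketch of checking the search value against the property it goes with. The function name ValidSearchValue is made up for this example, and the “greater than zero” tests are my reading of “always positive” in the table above. Note that it does not reject trailing junk (“5abc” would pass as 5); add that check if you need it.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: checks that the search value has the right type
// for the given property. Returns false if the property is unknown or
// the value can't be converted.
bool ValidSearchValue(const std::string& propertyName, const std::string& value)
{
    std::stringstream converter;
    converter << value;
    if (propertyName == "Name" || propertyName == "Type")
    {
        return !value.empty();       // any single word will do
    }
    if (propertyName == "PortionSize")
    {
        double size = 0.0;
        converter >> size;
        return !converter.fail() && size > 0.0;   // "always positive"
    }
    if (propertyName == "Portions")
    {
        int count = 0;
        converter >> count;
        return !converter.fail() && count > 0;    // "always positive"
    }
    return false;                    // not one of the four properties
}
```

For a search command, you would pass in argv[2] (the property) and argv[3] (the value), and bail if it returns false.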

Screen or file?

The next parameter (after the property value if we are doing a “search” or after the property
name if we are doing a “sort”) is either “screen” or “file”.

Anything else is an error.

See the code for “search” and “sort” – the very first parameter we checked – for hints on
how to deal with this.

Now, if they picked “screen”, ALL the next parameters are the data files.

If they picked “file”, the next parameter is the OUTPUT file name, followed by ALL the input
files.

So after we “parse” (that’s what we’ve been doing, by the way) all the command line
parameters up to this point, all the rest of the items are the input file names.

But how do we know how many there are? Well, there’s that argc parameter. If we had kept
track of which parameter we have read (and that’s not too difficult as we are working
through them one or two at a time) we know how many parameters are left.

Note that, depending on the “branch” of the if statements, we will have read a different
number of parameters, so watch this.
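
Here is one way you might work out where the input files start. It is only a sketch: the function name FirstInputFileIndex is made up, and it leans on something you can see in the example commands above, namely that “screen” or “file” lands at argv[4] for both “sort” and “search”. If your commands are laid out differently, the index will be different.

```cpp
#include <string>

// Hypothetical helper: returns the index in argv of the first input
// file name, or -1 if the destination is wrong or no files are listed.
// ASSUMPTION: argv[1] to argv[3] have already been checked, and the
// destination ("screen" or "file") is at argv[4].
int FirstInputFileIndex(int argc, char* argv[])
{
    const int destinationIndex = 4;
    if (argc <= destinationIndex)
    {
        return -1;                           // command line is too short
    }
    std::string destination = argv[destinationIndex];
    int firstFile = -1;
    if (destination == "screen")
    {
        firstFile = destinationIndex + 1;    // files follow right away
    }
    else if (destination == "file")
    {
        firstFile = destinationIndex + 2;    // skip the output file name
    }
    if (firstFile == -1 || firstFile >= argc)
    {
        return -1;                           // bad destination, or no input files
    }
    return firstFile;
}
```

The number of input files is then argc minus whatever index comes back.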

For example, suppose we are given this command:

myprog search Portions 5 file fiveportions.txt somefood.dat morefood.dat food3.dat

The first input file starts at argument six (6) – i.e. argv[6] is “somefood.dat”.

If we have parsed through the command up to this point, we know we’ve read five (5)
words from the command line after the program name (note that the number “5” would have
been passed as a character array from the operating system – we’d have to convert it to a 5
on our own – it wouldn’t change the fact that it’s counted as one of the items passed).

How many file names are left? Remember that argc counts the program name, too, so it
would be argc – 6. Here, argc is 9, which leaves 3 file names.

Or, you could loop through the remaining parameters with a for loop that started at 6 (in this
case) and ended at argc. Slick.

How do you read all the files in?

You could make a function that read the information into your vector, based on a file name
being passed to it. Something like this would be good:

void ReadDataFromFile(vector<Food> &theVec, string filename);

Notice that the vector is being passed by reference so that we can load more data onto the
vector each time we call the function. The files are all in the same format, so we are simply
reading some data from each file, combining all of it into one big vector inside our program.
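
To give you an idea of what the inside of that function might look like, here is a sketch. Big caveat: I am inventing both the Food struct and the file layout (four items per record, separated by white space, in the order Name, Type, PortionSize, Portions). Your own data files may well be laid out differently, so treat this as a starting point only.

```cpp
#include <fstream>
#include <string>
#include <vector>

// ASSUMED layout of a Food, based on the property table earlier on.
struct Food
{
    std::string Name;
    std::string Type;
    double PortionSize;
    int Portions;
};

// Sketch only. ASSUMPTION: each record in the file is four
// whitespace-separated items: Name Type PortionSize Portions.
void ReadDataFromFile(std::vector<Food>& theVec, std::string filename)
{
    std::ifstream inFile(filename.c_str());
    Food item;
    // The loop stops at the end of the file, at the first bad record,
    // or right away if the file could not be opened.
    while (inFile >> item.Name >> item.Type >> item.PortionSize >> item.Portions)
    {
        theVec.push_back(item);    // add on to whatever is already there
    }
}
```

Because the function only ever adds to the vector, calling it once per file leaves you with everything in one place. A real version would probably want to report a file that could not be opened instead of quietly skipping it.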

Assuming that the index of the first input file is saved in the “startingParam” variable,
here’s a small bit of code that would load all the data from all the remaining command line
parameters (assuming they are all valid files, etc.):

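int startingParam = 6;
// Using < (rather than !=) means the loop simply does nothing
// if startingParam is already past the end.
for (int x = startingParam; x < argc; x++)
{
    // Get the file name...
    string daFileName = argv[x];
    ReadDataFromFile(myFoodVec, daFileName);
}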

When this bit of code is done – and assuming that the ReadDataFromFile opens the
file and places the data into the vector – the vector would contain ALL the data from ALL
the files.

And now, the remaining command line parameters...

WAIT! We’re all done. There aren’t any more command line parameters.

Sweet.

All we have to do now, is the actual searching, sorting, and printing the data out to the
screen or file.
