Martes, Oktubre 7, 2014

Programming Language for Information Processing

A. Sample Programming Problems


   Whenever I’m TA for a introductory CS class where students learn some programming language, I have trouble coming up with good exercises. Problems from Project Euler and the like are usually much too difficult for beginners, especially if they don’t have a strong background in mathematics.

   This page is a collection of progressively more difficult exercises that are suitable for people who just started learning. It will be extended as I come up with new exercises. Except for the GUI questions, exercises are generally algorithmic and should be solvable without learning any libraries. The difficulty of the exercises of course somewhat depends on the programming language you use. The List exercises for example are more complicated in languages like C that don’t have build-in support for lists.
I suppose they are also useful, although much easier, whenever an experienced person wants to learn a new language.

Intermediate
  1. Write a program that outputs all possibilities to put + or - or nothing between the numbers 1,2,…,9 (in this order) such that the result is 100. For example 1 + 2 + 3 - 4 + 5 + 6 + 78 + 9 = 100.
  2. Write a program that takes the duration of a year (in fractional days) for an imaginary planet as an input and produces a leap-year rule that minimizes the difference to the planet’s solar year.
  3. Implement a datastructure for graphs that allows modification (insertion, deletion). It should be possible to store values at edges and nodes. It might be easiest to use a dictionary of (node, edgelist) to do this.
  4. Write a function that generates a DOT representation of a graph.
  5. Write a program that automatically generates essays for you.
    1. Using a sample text, create a directed (multi-)graph where the words of a text are nodes and there is a directed edge between u and v if u is followed by v in your sample text. Multiple occurrences lead to multiple edges.
    2. Do a random walk on this graph: Starting from an arbitrary node choose a random successor. If no successor exists, choose another random node.
  6. Write a program that automatically converts English text to Morse code and vice versa.
  7. Write a program that finds the longest palindromic substring of a given string. Try to be as efficient as possible!
 Advanced
  1. Given two strings, write a program that efficiently finds the longest common subsequence.
  2. Given an array with numbers, write a program that efficiently answers queries of the form: “Which is the nearest larger value for the number at position i?”, where distance is the difference in array indices. For example in the array [1,4,3,2,5,7], the nearest larger value for 4 is 5. After linear time preprocessing you should be able to answer queries in constant time.
  3. Given two strings, write a program that outputs the shortest sequence of character insertions and deletions that turn one string into the other.
  4. Write a function that multiplies two matrices together. Make it as efficient as you can and compare the performance to a polished linear algebra library for your language. You might want to read about Strassen’s algorithm and the effects CPU caches have. Try out different matrix layouts and see what happens.
  5. Given a set of d-dimensional rectangular boxes, write a program that computes the volume of their union. Start with 2D and work your way up.

B. C language elements


    The C Character Set
C uses the uppercase English alphabets A to Z, the lowercase letters a to z, the digits 0 to 9, and certain special characters as building blocks to form basic program elements viz. constants, variables, operators, expressions and statements.
The special characters are listed below:
!*+\"<
#(=|{>
%)~;}/
^-[:,?
&-]'.(blank)

Most versions of the language also allow some other characters, such as @ and $, to be included within strings and comments.
In addition certain combinations of these characters, such as '\b', '\n' and '\t', are used to represent special condition such as backspace, newline and horizontal tab, respectively. These character combinations are known as escape sequences.

   Identifiers and Keywords
Identifiers are names given to various items in the program, such as variables, functions and arrays. An identifier consists of letters and digits, in any order, except that the first character must be a letter. Both upper and lowercase letters are permitted. Upper and lowercase letters are however not interchangeable (i.e., an uppercase letter is not equivalent to the corresponding lowercase letter). The underscore character (_) can also be included, and it is treated as a letter. Keywords like if, else, int, float, etc., have special meaning and they cannot be used as identifier names.
The following are examples of valid identifier names: A, ab123, velocity, stud_name, circumference, Average, TOTAL
The following names are not valid identifiers:
1st- the first character must be a letter
"Jamshedpur"- illegal characters (")
stud-name- illegal character (-)
stud name- illegal character (blank space)

Although an identifier can be arbitrarily long, most implementations recognize typically 31 characters. There are some implementations, which recognize only eight characters. The ANSI standard recognizes 31 characters. If a system recognizes only 8 characters and if you use a variable name with more than 8 characters, only the first 8 characters will be taken, the rest will be ignored. Thus, average_of_numbers and average_ will both be recognized as average_. It is therefore safer to try and use short meaningful names for your identifiers.

Exercise:
Identify which of the following are valid identifiers.
(a)stud 1(d)switch(g) Average of Numbers(j) 123-45-678
(b)1stud(e)%calc(h) Average_of_Numbers
(c)stud_1(f)_x(i) Average-of-Numbers

Data Types and Sizes

There are only few basic data types in C. These are listed in the table below:
Data typeDescriptionSizeRange
charsingle character1 byte 0 - 255
intinteger number4 bytes-2147483648 to 2147483647
floatsingle precision floating point number (number containing fraction & or an exponent)4 bytes3.4E-38 to 3.4E+38
doubledouble precision floating point number8 bytes1.7E-308 to 1.7E+308
The list of data types can be increased by using the data type qualifiers such as - short, long, and unsigned. For example, an integer quantity can be defined as long, short, signed or unsigned integer. The memory requirement of an integer data varies depending on the compilers used. The qualified basic data types and their sizes are shown in table below.
Data typeSizeRange
short int2 bytes-32768 to 32767
long int4 bytes-2147483648 to 2147483647
unsigned short int2 bytes0 to 65535
unsigned int4 bytes0 to 4294967295
unsigned long int4 bytes0 to 4294967295
long double (extended precision)8 bytes1.7E-308 to1.7E+308

Note that the qualifier unsigned can be used along with the qualifiers short and long. The unsigned int occupies the same memory space as an ordinary int but differs on the possible content of the left-most bit. In case of an ordinary int it is reserved for sign (sign bit), but all the bits in an unsigned int are used for determining the magnitude of the integer quantity.
The char type is used to represent individual characters, and occupies one byte of memory. Each character has an equivalent integer representation (since each stores internally the ASCII value for the corresponding character). So char variables or constants can be used as integer data in arithmetic expressions.
The data objects to be manipulated in a C program are classified as variables and constants. The type of all the variables to be used in the program must be declared before they can be used. The operations that can be performed on the data objects are specified by a set of operators. Expressions used in a program combine the variables, constants and operators to produce new values.

Constants

The constants in C can be classified into four categories namely integer constants, floating point constants, character constants and string constants.
A character constant is written as for example - 'A' (always enclosed in single quotes).
Examples of string constants are - "Jamshedpur", "A", etc. Note that a string constant is always enclosed within double quotes.
A normal integer constant is written as 1234.
A long int is recognized by the presence of L (uppercase or lowercase) at the end of the constant, e.g. 2748723L.
The suffix u or U signifies the int to be an unsigned one.
The UL or ul at the end indicates the int quantity is of unsigned long type
Floating point constants contain a decimal point (167.903) or an exponent (1e-2) or both. Their type is double unless suffixed. The suffix of f or F indicates float; 1 or L indicates long double.
C also supports octal and hexadecimal data. The value of an integer data can be specified in either octal or hexadecimal form. A hexadecimal constant must begin with 0x or 0X, a leading 0 indicates the octal representation. Octal and hexadecimal constants may also be followed by U to indicate unsigned or L to determine long.
The number 0x2A5 is an example of a hexadecimal number. Internally the number is represented by the following bit patterns,
0x2A5 = 0010 1010 0101 = 2 * 162 + 10 * 161 + 5 * 160 = 677
        ---- ---- ----
          2    A   5
The number 677 is the decimal equivalent of the number 0x2A5.

Example of an octal number can be 0347. To represent each digit of an octal number in the form of binary, we need maximum of three bits since the last digit in the octal number system is 7.
0347 = 011 100 111 = 3 * 82 + 4 * 81 + 7 * 80 = 231 (in decimal)
       --- --- ---
        3   4   7

In numeric constants e.g. integer or floating point constants, blanks and any non-numeric characters cannot be included. The range of these constants will be limited by the maximum and minimum bounds (usually machine dependent).
A character constant is a single character enclosed in apostrophes such as 'A'. Such a constant is internally treated as an integer e.g. 'A' corresponds to the integer 65 (also known as its ASCII value). The ASCII value of 'a' (small) is 97. Hence character constants can participate in a numeric calculation just like any other integer, moreover, they also can be used for comparing with other character constants. Some character constants are of non-printing type which can be expressed by a combination of a back-slash (\) and a character. They are known as escape sequences. Each of them represents a single character even though they are written in terms of two or more characters.
Commonly used escape sequences are listed below:
CharacterEscape SequenceASCII Value
Bell\a007
Backspace\b008
Null\0000
Newline\n010
Carriage return\r013
Vertical tab\v011
Horizontal tab\t009
Form feed\f012

Exercises:
Identify which of the following numerical values are valid constants. If a constant is valid, specify whether it is integer or real. Also, specify the base for each valid integer constant.
(a)0.7b)39,634c)16.3e18d)16.3e-18
(e)123456789f)123456789lg)0.3E+0.4h)0.3E4
(i)0412j)018ACFk)0xABCDEl)0x97e334

Which of the following are valid character constants?
(a)'a'(b)'/n'(c)'F'(d)'\0492'
(e)'x'(f)'\\'(g)'\0'(h)'\n'
(i)'\b'(j)'\x-y-'




Which of the following are valid string constants?
(a)'9:25 A.M.'
(b)"Blue, Green and Indigo"
(c)"Name:
(d)"Section 4 (Next \'d')"
(e)"1.6e-18"
(f)"JAMSHEDPUR BR 831 001"
(g)"The Station master announced, "Down Geetanjali express is running late"

Symbolic Constants

Constants which play crucial roles in a program can be made more meaningful by assigning them appropriate names to make the program more readable and easily changeable. These constants are called symbolic constants and are defined as follows.
Examples:
#define PI 3.141593
#define TRUE 1
#define PROMPT "Enter Your Name :"
PI, TRUE and PROMPT are symbolic constants, so they do not appear in the declaration. #define is a preprocessor directive like #include. The salient feature of a symbolic constant is that the value assigned to a symbolic constant cannot be changed (unlike variables) subsequently in the program.
Exercise:
Write an appropriate definition for each of the following symbolic constants, as it would appear within a C program:

ConstantText
(a)AMOUNT-75
(b)EPSILON0.00001
(c)BIGIN{

END}
(d)FIRSTNAME"Sanjay"
(e)ENDOLIN'\n'
(f)COST"Rs. 125.75"

Variable Declaration in C

In a C program all variables must be declared before they are used. A declaration determines the type of the data, and contains a list of one or more variables having the same type.
Example:
intcount, index;
charflag, text[80];
short inta,b;
unsigned intp;
doubled;

A variable can be initialized with values at the time of declaration.
Example:
intc = 5;
charreply = 'Y';
doubled = 4.64386237445675;
charstate[] = "ANDHRA PRADESH";
floateps = 1.0e-5;

Exercises:
Write appropriate declaration for each group of variables and character array (strings):
(a)Integer variables: x, y
(b)Integer variable: count

Unsigned integer variable: employee no

Double-precision variables: net sales, tax, profit
(c)Character variables: first name, last name
(d)70 element character array: message

Write appropriate declaration and assign the given initial values to each group of variables and array:
(a)Floating-point variables: x = -9.5, y = 1.0004

Integer variables: a = 734, b = 49, c = -63

Character variables: c1 = 'a', c2 = '$'
(b)Double-precision variable: d1 = 3.94 * 10-12, d2 = -9.89 * 107

Integer variable: i = 437 (octal), h = 6AFF (hexadecimal)
(c)Long integer variable: large = 123456789

Double-precision variables: d = 0.6666666

Character variable: eol = newline character
(d)One-dimensional character array: message = "OVERFLOW"






 

Walang komento:

Mag-post ng isang Komento