Programming Stuff: November 2014

Tuesday, November 25, 2014

Java:

Changed PythagoreanSpeed program so that it recorded run time in nanoseconds rather than milliseconds. Tweaked the output slightly
I set it to do 100 passes at levels 50, 100, 500, and 1000:

Timings:
=======(n = 50)=======
Strategy UMD solution: 0.1177920396039604 ms.
Strategy UMD solution with Math.pow: 0.7689506237623762 ms.
Strategy UMD solution with pre-squared array: 0.1406431188118812 ms.
Strategy pre-squared with hash: 0.23302989108910893 ms.
=======(n = 100)=======
Strategy UMD solution: 0.2192501683168317 ms.
Strategy UMD solution with Math.pow: 1.022717900990099 ms.
Strategy UMD solution with pre-squared array: 0.15266823762376236 ms.
Strategy pre-squared with hash: 0.21271922772277227 ms.
=======(n = 500)=======
Strategy UMD solution: 17.247952891089106 ms.
Strategy UMD solution with Math.pow: 126.09220794059407 ms.
Strategy UMD solution with pre-squared array: 12.931144465346534 ms.
Strategy pre-squared with hash: 1.3764326336633663 ms.
=======(n = 1000)=======
Strategy UMD solution: 132.50848051485147 ms.
Strategy UMD solution with Math.pow: 1036.010291940594 ms.
Strategy UMD solution with pre-squared array: 100.51113749504951 ms.
Strategy pre-squared with hash: 5.131068326732674 ms.

I didn't add any code to round the output, so past the hundreds place or so those digits are probably not significant
Pushed changes to github
I then added an additional algorithm which is O(n²log(n)). All other conditions were kept the same.
Github link to algorithm

Timings:
=======(n = 50)=======
Strategy UMD solution: 0.12460728712871287 ms.
Strategy UMD solution with Math.pow: 0.8824542277227723 ms.
Strategy UMD solution with pre-squared array: 0.09626874257425742 ms.
Strategy pre-squared with hash: 0.24894374257425742 ms.
Strategy pre-squared with binary search: 0.14212682178217823 ms.
=======(n = 100)=======
Strategy UMD solution: 0.24023814851485148 ms.
Strategy UMD solution with Math.pow: 1.1169474059405942 ms.
Strategy UMD solution with pre-squared array: 0.23458126732673265 ms.
Strategy pre-squared with hash: 0.2551629306930693 ms.
Strategy pre-squared with binary search: 0.21795613861386137 ms.
=======(n = 500)=======
Strategy UMD solution: 17.942774574257427 ms.
Strategy UMD solution with Math.pow: 126.00360138613861 ms.
Strategy UMD solution with pre-squared array: 13.263654326732674 ms.
Strategy pre-squared with hash: 1.4827347524752474 ms.
Strategy pre-squared with binary search: 5.845997277227723 ms.
=======(n = 1000)=======
Strategy UMD solution: 132.98607085148515 ms.
Strategy UMD solution with Math.pow: 1002.6387993267326 ms.
Strategy UMD solution with pre-squared array: 100.78688199009902 ms.
Strategy pre-squared with hash: 5.523412445544555 ms.
Strategy pre-squared with binary search: 32.41068194059406 ms.

It's interesting that the binary search algorithm is actually faster at low values of n, likely because a hash of the values doesn't need to be created.

Monday, November 24, 2014

Stuff:

Replied to Mr. Elkner's comment about infinite meta refresh
Read Kevin's email, he suggested that I improve the timings program by adding multiple values for n. I was originally going to do values of {100, 1000, 10000, 100000}, but some of the algorithms seem to take a very long time values of n past 1000 so I'm not sure if I should figure out which ones run in a reasonable amount of time at large N values and run those, or just cut off the test after like 2,000 or so.
To keep run-time down, I only did three passes for each level, so the average values aren't as accurate as they could be.

Timings:
Strategy UMD solution with n 100: 2 ms.
Strategy UMD solution with Math.pow with n 100: 17 ms.
Strategy UMD solution with pre-squared array with n 100: 1 ms.
Strategy pre-squared with hash with n 100: 2 ms.
======
Strategy UMD solution with n 1000: 103 ms.
Strategy UMD solution with Math.pow with n 1000: 753 ms.
Strategy UMD solution with pre-squared array with n 1000: 70 ms.
Strategy pre-squared with hash with n 1000: 13 ms.
======
Strategy UMD solution with n 2000: 797 ms.
Strategy UMD solution with Math.pow with n 2000: 6051 ms.
Strategy UMD solution with pre-squared array with n 2000: 637 ms.
Strategy pre-squared with hash with n 2000: 21 ms.
======

I realized that these numbers were being averaged as longs, and changed it so that they would be divided as doubles. I re-ran the test:

Timings:
Strategy UMD solution with n 100: 2.75 ms.
Strategy UMD solution with Math.pow with n 100: 19.75 ms.
Strategy UMD solution with pre-squared array with n 100: 1.75 ms.
Strategy pre-squared with hash with n 100: 1.75 ms.
======
Strategy UMD solution with n 1000: 94.25 ms.
Strategy UMD solution with Math.pow with n 1000: 754.5 ms.
Strategy UMD solution with pre-squared array with n 1000: 70.75 ms.
Strategy pre-squared with hash with n 1000: 15.0 ms.
======
Strategy UMD solution with n 2000: 725.75 ms.
Strategy UMD solution with Math.pow with n 2000: 6063.0 ms.
Strategy UMD solution with pre-squared array with n 2000: 559.5 ms.
Strategy pre-squared with hash with n 2000: 19.0 ms.
======

Ruby:

Installed Ruby on Rails to my Ubuntu VM

EIMACS:

Read more about selection sort

Thursday, November 20, 2014

Web design:

Migrated website from students.gctaa.net/~ahirschberg/ to ahirschberg.github.io
I also accidentally created a redirect loop (now fixed), and I was surprised that chrome didn't warn me at all.

Wednesday, November 19, 2014

Java:

Updated weekend blog post about Pythagorean Timings, uploaded timing source code to github
Started learning about sorts

Website:

Set up ahirschberg.github.io to redirect to my students website
Learned about using meta refresh to redirect: code

Tuesday, November 18, 2014

EIMACS:

Took EIMACS test 18, which was on binary and sequential sorts
I'm finding that the tests put a lot of focus on being able to work through these obscure, pointless algorithms for no clear reason other than to jump through their hoop to get the answer.
For example:
What is the value returned by the line newsearch( new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }, 9 )?

  public static int newsearch( int[] a, int t ) 
  { 
    int n = (a.length - 1) / 2; 
    int k = 0; 
    while ( true ) 
    { 
      for ( int j = 0 ; j < a.length ; j += n ) 
      { 
        k++; 
        if ( t == a[ j ] ) 
        { 
          return k; 
        } 
      } 
      n--; 
    } 
  }

And my approach for solving it:
What does this even do? Why would you ever want to do this?? They make these complex algorithms for no reason! The only thing they are testing is my patience by making me work through each iteration, not my knowledge of searches.

Monday, November 17, 2014

Changes 11/17/14

I didn't like this:

public boolean equals( Object o )

  {

    return compareTo( (Item)o ) == 0;

  }

Strictly speaking, by overriding the equals method in this way we are violating the general contract for equals, which is required never to throw an exception. The usual way to avoid this uses theinstanceof operator (which is not part of the Advanced Placement Java subset):

  public boolean equals( Object o )

  {

    return ( o instanceof Item ) &&

             ( compareTo( (Item)o ) == 0 );

  }

EIMACS:

I don't like this because you would never, ever want to use the first implementation of equals. Even though they make a note of this, it shows that the philosophy for their curriculum is out of touch with reality.
Since the AB exam is no longer a thing, eimacs really needs to update their curriculum to stop focusing on the Collegeboard's curriculum. Operators like instanceof are important in Java, and acting like they don't exist is just plain stupid. I wish they would just teach instanceof once so they don't have to make a special case every time instanceof is necessary for solving a problem.

Learned more about binary search, and wrote my own, which was kind of tough because I knew roughly what it should look like but forgot what the midpoint should be set to on each iteration.
I had mid = (high - low) / 2, when it should actually be low + (high - low) / 2, or (high + low) / 2
Learned that a binary search runs in O( log₂n )

Changes Weekend 11/14 to 11/16

Java:

Fixed the problems I was having with getting different counts in different Pythagorean Triples algorithms run on the same data set. What I realized was that the non-hash algorithms would count a Pythagorean Triple for each valid C value if there was more than one, while the hashed algorithm would not. For example, in the set {3,4,5,5}, the O(n³) algorithms would find 2 triples in this set, while the hashed algorithm would find one. To remedy this, I set the O(n³) algorithms to break once a C value had been found.
Created a program that tests the average run times for various implementations of the Pythagorean Triples program with n=1000

Timings:
Strategy UMD solution: 115 ms.
Strategy UMD solution with Math.pow: 984 ms.
Strategy UMD solution with pre-squared array: 78 ms.
Strategy pre-squared with hash: 6 ms.

The accuracy of this measurement could probably be improved if I had used something more accurate than System.currentTimeMillis(), but as it is it illustrates how much faster using a hash is than any other implementation.

Web Design:

I made a website over the weekend: tommywiseau.github.io
It's a tribute to the movie The Room, written and directed by Tommy Wiseau
It's essentially like a soundboard, but for videos. Left clicking one of the video buttons will play the video, and right clicking will toggle looping on the video.
If you click Tommy Wiseau's face in the top left, it plays a recording of one of the many times Tommy Wiseau laughs awkwardly.
If you don't like hearing bad words, make sure to click the "toggle NSFW" button!
Additionally, if you want to be taken on a magical cinematic experience, press the "Play all" button.

Friday, November 14, 2014

Changes 11/14/14

UMD Programming Contest / Java:

At NSF Kevin and I started working on a way to benchmark the efficiency of the various algorithms used to find Pythagorean triples.
We are testing UMD's base solution, their solution with Math.pow used to square the arrays, their solution with a pre-squared array, and Kevin's solution with a pre-squared array and hash.
The program runs, but Kevin's solution finds a different number of triples than the other three strategies. For example: count 1619 did not equal lastCount 1454 for strategy: pre-squared with hash
I'm not sure the best way to go about debugging this, I'll work on it on Monday.

Changes 11/13/14

Java:

Took eimacs test 17, which was on inheritance (abstract classes and interfaces)
Reviewed sequential and binary search algorithms

Wednesday, November 12, 2014

Changes 11/12/14

UMD Programming Contest:

Uploaded Vigenere Cipher problem to github
It took me a while to figure out because I was having trouble with a certain line.
The correct implementation is:

sb.append((char)('A' + ((keyChar + plainChar) % 26)));

At first, I didn't quite understand what I was doing. Sam and I had gone through the problem together, but I didn't fully understand his solution, so I decided to work on it on my own. I was pretty good on everything else in the program, but couldn't quite remember what this line had done in his program.
Basically what this line does is take a character from a target string and offset it by the distance between a character in a "key" string and the letter A, which is the fundamental of a Vigenere Cipher.
I kept thinking that the keyChar should be subtracted from the plainChar, because I felt like they should be a range. Obviously, that wasn't correct, and it threw me through a loop because I couldn't figure out what I was doing wrong
Eventually I decided to work through the example they gave in the problem description while also looking at my code, which is how I figured out what I was doing wrong
Doing this problem also made me realize I need to get more experience with the modulus operator (%).
Sam used them really well in his implementation to loop the index for the character we were looking at in the key string and making sure that the offset for the character loops back around when it happens to offset past Z.

Changes 11/11/14

UMD Contest:

Worked with Sam on the Vigenere Cipher problem (number 3)
He got it almost right away, but I'm still working on my implementation, because I'm having trouble debugging it.

Friday, November 7, 2014

Changes 11/7/14

UMD Programming Contest:

Had a conversation with Kevin about more efficient ways to fulfill the functionality requested by the Pythagorean triples problem from the contest.
He suggested converting the array to a hash, using two for loops in order to identify a and b values, then checking if the corresponding c value was present in the hash.
Since getting a specific element in a hash table is an O( 1 ) operation (although I haven't learned that yet), his solution would bring this down to O( n² ).
He also suggested pre-sorting the array and pre-squaring all the values, which would bring performance benefits for large values of n
Looked at UMD's solution to the Pythagorean triples problem
I guess I'm not quite clear on the design philosophy behind this contest. Are we supposed to have the simplest solution, the most elegant, the fastest? Can I pull in more advanced constructs like Hashes, or am I stuck with only the default imports of Java. In their solution for this problem, they used Arrays.sort(), but did not use Math.pow().

Thursday, November 6, 2014

Changes 11/6/14

UMD Programming Contest:

Realized I had made an incorrect assumption about the data. The first number of each set to check (such as in the table at the bottom of page 4 of the pdf) appears to be number of integers on that line, and not actually a number to be checked for triple-ness.
This had also caused me to make the incorrect assumption that each number was to be counted only once, which clearly isn't true, because the line 5 13 12 3 4 5 is expected to output both {5 12 13} and {3 4 5}. Since the first five in that set is the number of integers, the second five has to be counted twice.
After making this assumption, I then wasted valuable time making the program set values it had identified were Pythagorean triples to -1 and adding integer > 0 checks to all the for loops so that values of -1 would be ignored.
I also feel like the designers were very lazy in their setup of the environment, which leads to problems like this where they have hardcoded the expected length of the line into the file without being explicit about what they are doing.
To be fair, they did say:
"The first line in the test data file contains the number of test cases (< 100). After that, each line contains one test case: the first number is n (3 ≤ n ≤ 50), the size of the sequence, and the following n numbers are the sequence (not necessarily in any order)"
However, that's not worded very intuitively, and I either missed that completely or just didn't understand what they were saying. One of the lessons I'm taking away from this problem in particular is to read the prompt and explanation very carefully.
I eventually got an implementation that worked, although it runs in O( n³ ) since there are three nested for loops.
Even in victory, I feel defeated, especially because I had to cheat a little and googled the best way to find the b value in a² + b² = c², where a < b < c.
I found a really simple solution to finding this intermediate value on stackoverflow, which was to add a+b+c and subtract the min and max values, leaving the only value that wasn't an extreme. I wish I could have thought of this on my own, but I just don't have enough practice with this sort of thing yet.

Changes 11/5/14

UMD Programming Contest:

Wow today was horrible
The enviroment they have set up for the programming contest makes it almost impossible to debug your code. They set System.out to output to a file, so if for whatever reason your program doesn't write the file, you have no way to find out why. I ran into a problem where my for loops were causing endless execution, but the program gave no indication of this. I checked in a normal java environment, and it turns out that the code:

int i = 0;
  for (int j = 0; j < 5; i++)
    System.out.print("");

does not give any compiler warnings, even though the compiler should be able to detect that this is extremely likely to be an unintended infinite loop. This is the first problem I ran into in my code, and it took me almost 45 minutes to discover the source since I wasn't able to print debug output!
I also just don't have a head for this sort of thing. The problem I'm currently working on is problem number 2 from the 2012 contest, which tasks you with identifying Pythagorean triples from an array of integers.
In my current strategy, I've got for loops nested three deep, one to identify each of the three numbers in the formula a² + b² = c². There is probably a better way of finding triples, but I have no idea what it would be.

Monday, November 3, 2014

Changes 11/3/14

UMD Programming Contest:

Wasted about 45 minutes trying to import the pythagorean triples project into IntelliJ. I eventually got it by selecting "Import Project" > "Create Project From Existing Sources". Previously I had been selecting "Import Project from External Model", which wouldn't be set up correctly.
Eventually I'll switch over to Eclipse
Working on the Pythagorean Triples problem
I'm sure there's plenty of resources on finding Pythagorean triples online, but I don't think I would be allowed to look at those during the contest, so I'm not going to look at those.