A Rapid-I analyst asked me to test RapidMiner on EC2 with a more powerful instance than I had used previously. Rapid-I is the company that manages RapidMiner; although the software is open-source, the company provides consulting and training on implementation. I was happy to oblige.

Here are the instance types Amazon offers:

    Standard Instances

    $0.10 - Small Instance (Default)
      1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core with 1 EC2 Compute Unit), 160 GB of instance storage, 32-bit platform

    $0.40 - Large Instance
      7.5 GB of memory, 4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each), 850 GB of instance storage, 64-bit platform

    $0.80 - Extra Large Instance
      15 GB of memory, 8 EC2 Compute Units (4 virtual cores with 2 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform

    High-CPU Instances

    $0.20 - High-CPU Medium Instance
      1.7 GB of memory, 5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each), 350 GB of instance storage, 32-bit platform

    $0.80 - High-CPU Extra Large Instance
      7 GB of memory, 20 EC2 Compute Units (8 virtual cores with 2.5 EC2 Compute Units each), 1690 GB of instance storage, 64-bit platform

The one I tested last time (the Small Instance) is highlighted in blue, and the faster one I tested this time (the High-CPU Medium Instance) in red. I am constrained to 32-bit because the only publicly available Ubuntu image with remote desktop capabilities is 32-bit.
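
As a rough way to compare the options, here is a small Python sketch that works out EC2 Compute Units per dollar from the list above. It assumes the listed prices are per instance-hour (the list itself does not state the unit), and the function name `ecu_per_dollar` is just an illustrative helper. It ignores memory, storage and bandwidth entirely.

```python
# Prices and EC2 Compute Units copied from the instance list above.
# Assumption: each price is per instance-hour.
instances = [
    ("Small", 0.10, 1),
    ("Large", 0.40, 4),
    ("Extra Large", 0.80, 8),
    ("High-CPU Medium", 0.20, 5),
    ("High-CPU Extra Large", 0.80, 20),
]


def ecu_per_dollar(price_per_hour, compute_units):
    """Compute units you get for one dollar of instance time."""
    return compute_units / price_per_hour


for name, price, ecu in instances:
    print(f"{name:22s} {ecu_per_dollar(price, ecu):5.1f} ECU per dollar-hour")
```

On these numbers the three Standard instances all work out to the same ratio, and the two High-CPU instances to a higher one, so for a CPU-bound job the High-CPU types look like the better deal per dollar.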

I ran the same test as last time, and the results improved by a factor of about 3x: RapidMiner running on my laptop on Windows is on top with the chrome UI. EC2's Ubuntu remote desktop is behind with an orange UI. It's about 7400 ms vs 17000 ms. This is an improvement, but my ultra-portable laptop only has a 1.06 GHz mobile CPU and 1 GB of memory.
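
For reference, the ratio between the two timings quoted above is simple to check. This is only arithmetic on the two rounded numbers from the post; the variable names are mine. Note that it is a separate figure from the roughly 3x improvement, which compares this run against the previous one.

```python
# The two rounded timings quoted above, in milliseconds.
faster_ms = 7400
slower_ms = 17000

# How many times longer the slower run took.
ratio = slower_ms / faster_ms
print(f"{ratio:.1f}x")
```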

I also ran an internet speed test out of curiosity:
20MB/sec downloads and 10MB/sec uploads - even better than when I was running off an optical fiber in Korea.

Getting EC2 up and running again was not nearly as easy as I expected. If the service were more user-friendly I would seriously consider using it regularly, but for someone who is not very proficient with Unix it requires a lot of patience. The bill for my previous usage was much lower than I expected, only $0.75. That included a lot of overhead time spent on setup and figuring out how to use it.

I'd be interested to hear from anyone who has tested a 64-bit instance: which AMI you used, and how much the extra memory improved performance.

5 comments:

Eric said...

Max - Can you send me the files needed to replicate this test? I just purchased a new HP workstation and would be interested in comparing the two machines.

Thanks,
Eric

Max Dama said...

Eric,

Here is the dataset and here is the experiment. Be sure to change the file destination in the ExcelExampleSource operator.

I'd like to know your results and the machine's basic specs too.

Regards,
Max

Max Dama said...

Received an email from Amazon regarding Windows Servers:

Dear Amazon Web Services Developer,

We are excited to let you know that Amazon Elastic Compute Cloud (Amazon EC2) will offer you the ability to run Microsoft Windows Server or Microsoft SQL Server starting later this Fall. Today, you can choose from a variety of Unix-based operating systems, and soon you will be able to configure your instances to run the Windows Server operating system. In addition, you will be able to use SQL Server as another option within Amazon EC2 for running relational databases.

Amazon EC2 running Windows Server or SQL Server provides an ideal environment for deploying ASP.NET web sites, high performance computing clusters, media transcoding solutions, and many other Windows-based applications. By choosing Amazon EC2 as the deployment environment for your Windows-based applications, you will be able to take advantage of Amazon's proven scalability and reliability, as well as the cost-effective, pay-as-you-go pricing model offered by Amazon Web Services.

Our goal is to support any and all of the programming models, operating systems and database servers that you need for building applications on our cloud computing platform. The ability to run a Windows environment within Amazon EC2 has been one of our most requested features, and we are excited to be able to provide this capability. We are currently operating a private beta of Amazon EC2 running Windows Server and SQL Server. Please go to aws.amazon.com/windows if you are interested in being notified later this Fall when the offering is released broadly.

Sincerely,

The Amazon Web Services Team

Ernie Chan said...

Max,
I tested Matlab2IB API. It is pretty easy to use, with few bugs so far, and excellent customer support. They will send you .m files if you need common functions that are lacking.

BTW: I also added a link to your blog on epchan.blogspot.com

Do you find that EC2 is the best solution as a trading server? Do you recommend any other managed server or cloud computing platforms that have a fast enough internet connection for high frequency trading?

Thanks,
Ernie

Max Dama said...

Ernie,

I've replied via email.

Max