Jose Sandoval

Difference between a REST call and a non-REST call
Friday, February 27, 2009

The fundamental difference between modern web application development and legacy web application development is how we think of the actions taken on chunks of data. Modern development is rooted in the concept of nouns; legacy development is rooted in the concept of verbs.
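To make the nouns-versus-verbs idea concrete, here is a minimal sketch in Python. The helper names (rest_request, rpc_request) and the URL shapes are hypothetical and only illustrate the two styles; they are not taken from any particular framework.

```python
# Noun-based (REST-style): the URL names the resource and the HTTP method
# carries the action.
def rest_request(action, resource, resource_id):
    methods = {"read": "GET", "create": "POST",
               "update": "PUT", "delete": "DELETE"}
    return methods[action], "/%s/%s" % (resource, resource_id)


# Verb-based (non-REST style): the action is baked into the URL itself and
# everything goes through the same HTTP method.
def rpc_request(action, resource, resource_id):
    return "POST", "/%s%s?id=%s" % (action, resource.capitalize(), resource_id)


print(rest_request("delete", "users", 7))
print(rpc_request("delete", "user", 7))
```

In the first style the thing being acted on (the noun) is the stable part of the call; in the second, every new action needs a new endpoint.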

7:29 PM | 0 comment(s) |

What's a Software Architecture?
Thursday, February 26, 2009

A software architecture deals with (taken from Beautiful Architecture):
    What functionality does the product offer to its users?

    What changes may be needed in the software in the future, and what changes are unlikely and need not be especially easy to make in the future?

    What will the performance of the product be?

    How many users will use the system simultaneously? How much data will the system need to store for its users?

    What interactions will the system have with other systems in the ecosystem in which it will be deployed?

    How is the task of writing the software organized into work assignments (modules), particularly modules that can be developed independently and that suit each other's needs precisely and easily?

    How can the software be built as a set of components that can be independently implemented and verified? What components should be reused from other products and which should be acquired from external suppliers?

    If the product will exist in several variations, how can it be developed as a product line, taking advantage of the commonality among the versions, and what are the steps by which the products in the product line can be developed (Weiss and Lai 1999)? What investment should be made in creating a software product line? What is the expected return from creating the options to develop different members of the product line?

    In particular, is it possible to develop the smallest minimally useful product first and then develop additional members of the product line by adding (and subtracting) components without having to change the code that was written previously?

    If the product requires authorization for its use or must restrict access to data, how can security of data be ensured? How can "denial of service" and other attacks be withstood?
It's a mouthful, isn't it?

5:41 PM | 0 comment(s) |

Google's GMail and encrypted ZIP files

I use my Gmail account as a storage device--7GB is a lot of free disk space. I zip my files and email them to myself. From time to time, I have sensitive data that I don't want Google to look at (all right, Google's programs). Because of this, I encrypt the zip archive with the strongest key possible (right now, it's a 256-bit key; this thing is virtually unbreakable). As it happens, Google doesn't like it if you do this, because its virus scanner can't read the file and immediately flags it as a virus. If you've done this, you know your messages will bounce.

What are you to do, if you really want to send that encrypted email to someone (or yourself)?

Simple: just change the extension of the file to something else. I change it to '.txt' and send the file. Google's virus scanner doesn't flag it and it sends fine.

Note that you can do the same for executable files, as Google will not let you email anything that is a program (a file with an extension of .exe, .com, or .bat). So, change the extension from '.exe' to '.txt' and that's it.

Of course, when you download the file from Gmail you have to remember to change the extension of the file back to its original.

This trick defeats the purpose of the virus scanner: a stream of malicious binary data is floating around just waiting to be executed. At some point, Google will catch up to our tricks and will make its virus scanner software smarter than it is now. Until then, happy emailing encrypted (or executable) files.

3:24 PM | 3 comment(s) |

WTF, Digsby? I'm just trying to install you and nothing else...
Monday, February 23, 2009

I started using Digsby a few months ago, because I have online contacts that use different IM technologies--Digsby brings them all under one account. However, installing the actual IM software is a pain now. Not because it's hard to run the installer, but because you need to pay attention to all the crap it tries to install without you looking. In total, there are 8 screens in the installer and each one requires you to read something and then click something. Check all the Accept/Decline screens you have to go through:

In this screen you have 11 choices.

In this screen you have 9 choices.

In this screen you have 13 choices (not counting the scroll bar).

In this screen you have 11 choices.

In this screen you have 11 choices.

In this screen you have 12 choices.

In this screen you have 8 choices.

And finally 3 choices.

The number of screens is excessive. What's more, each screen carries too much information and gives users too many choices. For example, throughout the install process I had 78 possible clicks (not counting the scroll bars). Imagine if you read everything available in them; you would need half a day to install one application. This is the reason we users have bloated computers: we don't read everything and just click the Accept button at every turn in the hope of just getting to use whatever software we are installing.
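The tally is easy to check. A quick sketch in Python, using the per-screen counts listed above (the variable names are mine):

```python
# Choices offered on each of the installer's screens, in order.
choices_per_screen = [11, 9, 13, 11, 11, 12, 8, 3]

total_screens = len(choices_per_screen)
total_clicks = sum(choices_per_screen)

print(total_screens, total_clicks)
```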

Why are so many useless tools trying to get into my computer? It's the side effect of a business model that generates no cash: you need to accept sponsorship or marketing money to survive. Note that Digsby's makers are not hiding the fact and have a 'Why is this Free?' link in the first screen of the installer. This is their reason:
    Why is this free?

    After clicking "Accept" you will be offered additional useful, quality software provided by our reputable partners. Your support of these software offers allows us to provide you with Free access to our software. All offers we present are 100% optional.
The usefulness of the offered apps is debatable, but I will neither judge nor install them.

Going through this experience today made for an interesting break in my day, as I'm currently designing (thinking right now) an installer for a desktop Java application I developed for a client. The app is being used in the US and Canada, but it's not the easiest to install: there is too much manual labour involved.

I'm currently waiting for the go-ahead from the client to implement the ideal one-click installer. We will see how many screens I need, as I have to create background installers for the latest Java runtime environment, MySQL, and the actual binaries for the application. I hope I can stop at 3; after all, we don't need to install other bloatware to justify the cost of the app.

What does the ideal installer look like? I'm glad you asked:

5:02 PM | 0 comment(s) |

Google App Engine and Python Learning from a Java Developer's Point of View

I've had a Google App Engine account for a while now, but I had never created anything of substance until this weekend. I have always liked Google's concept of forgetting the hard and expensive parts of deploying applications on the web and concentrating on the actual business rules. Not knowing if the promise would apply to me, I started to seriously look at this architecture a couple of days ago.

One of the reasons I hadn't coded an application in this architecture is that I didn't know Python--Python being the only supported programming language so far. I'm no stranger to learning new languages and I actually like trying new things, so being sick (a head cold) and unable to sleep, I decided to learn Python to create a real application. This was not such a good idea, as I didn't sleep at all last night, because I was really excited about my new toy.

I'm writing this entry in case you are in the same position I was: you are a developer of large-scale systems and would like to try something new--a way to recharge the batteries, if you will. In this post, I can only answer 2 questions: do I like Python? And what is it like for a developer with some experience with other OO languages to come fresh into this scripting world?

First, I like Python. I mean, I don't love it but I don't dislike it, as it's just another implementation technology--I don't really care what I code in: picking up the syntax of a new language is not that hard; finding interesting problems to solve is the hard part.

Having been in the trenches for 2 days straight, I think it's a nice OO language, but I'm used to "real" programming languages with semicolons, curly braces, and statically typed variables. If you are old school, Python can seem, at first, a bit toy-like and unsophisticated (it isn't, of course).

My experience with the language was minimal, and I had only skimmed through a couple of books at the local bookshop. I had also read Google's intro for working with their architecture. It took me a couple of hours to start actually programming, not just copying and pasting code from one window to the next. At the end of the first day (6 hours or so), I got the hang of it, and the language has grown on me.

I have to admit, however, that I like it because of its applicability to the Google App Engine environment. I don't know of many large Python projects, but maintenance of a large-scale system is probably hard. It's true that you can layer your development logic, but newer web applications don't require 50+ engineers to complete, and things can get messy fast. Mind you, we can write messy code in any language. Nevertheless, I like to try to fake a semblance of control. Well-structured code is something I take to heart, as it is a pleasure to read and maintain (even those spaces and tab sizes matter).

Which brings me to Python's coding style. In the Python world, variableNames are not what I'm used to. Method naming conventions follow a very SQL-like naming structure with underscores_every_where. I could just use the style I'm used to, but I can't--like I said, I like to follow proper coding styles for the language I'm using. Because of this, before I did any coding, I read Python's coding style guidelines.

Not everything about the language is to my liking. One of my biggest pet peeves is the interpreter's strict use of indentation to determine the scope of statements. For example, if one line of code is not indented properly, the program containing that line will never run. It seems like a silly thing to penalize developers for a stray space. I guess it's one of the reasons I like the C/C++/Java/C# type of languages: the braces and semicolons take care of scoping.

So that you don't think I'm complaining because I'm being a big cry-baby, this is an actual class in my application:

class Job(webapp.RequestHandler):
    def get(self):
        resume_key = self.request.get('resume_key')
        resume = db.get(resume_key)

         template_values = {'loggedIn': False,
                            'displayResume': True,
                            'resume': resume}

        path = os.path.join(os.path.dirname(__file__), 'resume.html')
        self.response.out.write(template.render(path, template_values))

You would likely spend a lot of time looking for the bug in this code, so I'll just tell you. Note the line where template_values is defined: I have added an extra space to make my point. This code will not run; Python rejects the unexpected indent with an IndentationError. Fortunately, you can catch these bugs if you use an IDE like Eclipse with a development plug-in like Pydev. (To set up your GAE environment to run within Eclipse, follow these instructions: Google App Engine & eclipse (PyDev).)
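If you want to see the failure without an App Engine setup, here is a small self-contained sketch. It is not the application code: the two snippets are stripped-down stand-ins for the handler above, and the check() helper is mine. It uses Python's built-in compile() to parse each snippet without running it.

```python
# Two tiny stand-ins for the handler body: identical except for one extra
# leading space on the template_values line in BAD.
GOOD = (
    "def get():\n"
    "    resume = 'r'\n"
    "    template_values = {'resume': resume}\n"
    "    return template_values\n"
)

BAD = (
    "def get():\n"
    "    resume = 'r'\n"
    "     template_values = {'resume': resume}\n"
    "    return template_values\n"
)


def check(source):
    """Compile without executing; report whether Python accepts the layout."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as err:
        return "IndentationError: %s" % err.msg
    return "ok"


print(check(GOOD))
print(check(BAD))
```

The point is that the error is raised while the file is being parsed, so nothing in the module runs, not even the parts that are indented correctly.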

As usual, the going started slow, but after two days the number of usable, working lines of code I can produce has increased considerably, and it continues to increase, especially now that I know how to shorten lines of code. For example, the above code can be written as a single statement--but I wouldn't recommend it:
class Job(webapp.RequestHandler):
    def get(self):
        self.response.out.write(template.render(
            os.path.join(os.path.dirname(__file__), 'resume.html'),
            {'loggedIn': False,
             'displayResume': True,
             'resume': db.get(self.request.get('resume_key'))}))
Which version did I use in my program? The first one. I'm just making a point that you can do crazy things once you become familiar with the syntax.

Did I buy any Python books for my programming adventure? I didn't. I looked for books; however, I didn't like any of the currently published works: I found them boring. I would not have picked up the language as quickly as I did by running silly string splitting examples all day, or counting the keys in a dictionary ADT until sunrise.

If you want to learn Python, you should consider the Google App Engine architecture to play with. It's a complete development platform and you get immediate rewards: you store things into a non-relational database by calling a put() method from an inherited Model class; you parse HTTP requests by getting the name/value pairs from the input stream; and you send emails by calling the supplied mail library. I did all of these in my Resume Builder application, without knowing too much about Python.

As an aside, I've come to know that learning programming can be tedious, and it borders on obsessive-compulsiveness. It's hard to explain, but if you haven't sat in front of a computer for 12+ hours straight writing code, you will not understand the thrill of seeing an application run in front of you. I only mention this because I have never been able to learn coding by reading a book, which is why I think the Python books I considered were boring. Mind you, this is just my experience; you may well be able to learn anything with very little supervision and just by reading. I'm just saying that I got more out of programming this application and learned a great deal because of the Google App Engine architecture. I don't think I could have got the same gentle introduction to the language from any of the typical Python books. In other words, I learned Python and I am now a believer that Google's architecture is a viable computing platform for real web application development.

About the Resume Builder: I wanted to play with the latest UI recommendations that people are talking and writing about. I recommend reading the book on the left, if you have any interest in user interfaces for the web. The book is good, but there are no implementation examples in it--to be fair, the book is not an example-driven read and the authors make that clear in the introduction. For code samples, you will have to look around and code things yourself. Moreover, you will have to use scriptaculous or jQuery to implement any of the recommendations; however, you will find that most of the hard work is already done for you with these open source libraries.

What does my app do? It's all about inline editing, discoverable elements, and Ajax calls galore. Oh, I also added Google's AdSense to see if it generates any money. Finally, this app does one thing only: you can create your resume in one step, and you can then send people a link to it. For example, this is my resume.

12:11 AM | 3 comment(s) |

Facebook's security hole?
Thursday, February 12, 2009

Serving dynamic content on the web is tricky. In the end, all the content needs to be stored somewhere and needs to be rendered as web pages. In the case of Facebook, there are gigabytes of data stored on a large number of servers. Of course, all of the data is secure... or is it?

When you create a Facebook profile, you are "guaranteed" a secured area for your content. Note that I put guaranteed in quotes, because not all of your content is password protected--at least, it isn't right now. For example, this picture of one of my drawings is accessible without you having to enter your Facebook id and password.

This is obviously a problem or an oversight. More important, if you play around with the URL, you can get different images. Check this out:

I don't know this guy, and it's likely that he doesn't want us to be looking at his "private" picture, but here it is available without a Facebook id and password.

This is what the images look like right now: me; dude I don't know.

Is this really a problem? The answers can be left to interpretation, and some could say yes and some could say no. I'm on the side of this actually being a security hole.

The problem is that images in Facebook pages live locally in web servers that are accessed as follows:

<img src=/mydir/myimage.jpg>

Nothing surprising with this code: we have a directory and an image on some computer.

This code is how secure Facebook pages load images. Mind you that they have all those crazy numbers in the directory and image names, but it's the same thing: a link to a static file somewhere.

So the question is, then, what is secured and what is not secured in a Facebook session?

In the case of a Facebook page, the content is secure but the image is not. You could argue that the image is secure because the HTML that renders the page is password protected and there is no way for anyone to guess the location of the image. This is nonsense, of course, because if I copy the link of the image and provide it to anyone without a Facebook account he or she can see the image. This is hardly secure.

Schematically, this is what a page currently looks like:

If this is a problem, is there a solution? Yes, actually. What Facebook could do is to stream the images via some proxy process (a streaming program). This way, users don't know where the image is and the image would only be displayed if users are authenticated. A high level view of the solution looks like:
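A minimal sketch of such a proxy, written as a plain WSGI application in Python. This is only an illustration of the idea, not how Facebook does (or would do) it: the session store, the image store, the cookie name, and the URL layout are all hypothetical in-memory stand-ins.

```python
from http import cookies

# Hypothetical stand-ins for a real session store and real image storage.
ACTIVE_SESSIONS = {"abc123": "jose"}
IMAGES = {"42": b"\xff\xd8fake-jpeg-bytes"}


def image_proxy(environ, start_response):
    """Stream an image only if the request carries a valid session cookie."""
    jar = cookies.SimpleCookie(environ.get("HTTP_COOKIE", ""))
    session = jar["session"].value if "session" in jar else None
    if session not in ACTIVE_SESSIONS:
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"login required"]

    # The browser only ever sees an opaque id, never the file's real location.
    image_id = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
    data = IMAGES.get(image_id)
    if data is None:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"no such image"]

    start_response("200 OK", [("Content-Type", "image/jpeg"),
                              ("Content-Length", str(len(data)))])
    return [data]


def call(app, path, cookie=""):
    """Tiny helper: invoke the WSGI app directly and return (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app({"PATH_INFO": path, "HTTP_COOKIE": cookie},
                        start_response))
    return captured["status"], body


print(call(image_proxy, "/images/42")[0])
print(call(image_proxy, "/images/42", "session=abc123")[0])
```

Every image request now passes through application code, which is exactly where the extra cost comes from.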

The issue with this solution is that it is CPU intensive, because we are adding one more layer on top of HTTP to serve the image. HTTP servers are optimized to serve static content, but once you add something between the browser and the server, this something will require CPU cycles and memory. With thousands of concurrent users, this is a big problem. Not a new problem, but a problem nonetheless.

Another feasible solution for Facebook is to modify the HTTP server itself to handle pages serving secure images that are within the context of a session. This is not hard, but it adds more complexity to the overall architecture. I'd probably consider this as option B, as creating a streaming process is a trivial programming exercise.

Finally, the last solution I can think of involves configuring web servers so that any requests that are not originated from will be considered invalid.

Regardless of what you and I think about this, the "hole" needs to be fixed. It strikes directly at the core software engineering concept of abstraction and hidden implementation.

5:31 PM | 0 comment(s) |

The Agony of the End
Monday, February 09, 2009

From the beginning, I knew it would end.
Yet I endured through the highs and lows.

At times I laughed; at times I cried;
but I was always faithful through it all.

The journey brought me comfort;
the unyielding loyalty gave me peace.

I could just shut down and avoid the pain.
But I can't. I mustn't.

I do it all the time: it begins with a flirty glance,
and I let destiny take its course.
I don't believe in destiny,
but nothing is left to chance.

The anxiety is unbearable now.
So much coming to an end, and I know it ends.
But tomorrow, I can do it all over again.

(My ode to the last page of a book.)

11:37 PM | 0 comment(s) |


© Jose Sandoval 2004-2009