rails, PostgreSQL and the native uuid type
UUIDs have the very handy property that they are unique and that there are plenty of them for you to use. They are also difficult to guess: knowing the UUID of one object, it's very hard to guess a valid UUID of another object.
This makes UUIDs perfect for identifying things in web applications:
- Even if you shard across multiple machines, each machine can independently generate primary keys without (realistic) fear of overlapping.
- You can generate them without using any kind of locks.
- Sometimes, you have to expose such keys to the user. If possible, you will of course do authorization checks, but it still makes sense not to let users know about neighboring keys. This gets even more important when you are not able to do authorization checks because the resource you are referring to is public (like a mail alias), but it should still not be possible to find other items if you know one.
Knowing that UUIDs are a good thing, you might want to use them in your application (or you just have to in the last case above).
There are multiple recipes out there that show how to do it in a rails application (this one for example).
All of these recipes store UUIDs as varchar's in your database. In general, that's fine and also the only thing you can do as most databases don't have a native data type for UUIDs.
PostgreSQL, on the other hand, has a native 128-bit type for storing UUIDs.
This is more space efficient than storing the UUID in string form (288 bit) and it might be a tad bit faster when doing comparison operations on the database as integer operations (even if they are this big) require a constant amount of operations whereas comparing two string UUIDs is a string comparison which is dependent on the string size and size of the matching parts.
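The size difference is easy to check. Here is a minimal sketch in Python, where the standard `uuid` module merely stands in for whatever generates your keys (it is not part of the Rails setup described in this post). It compares the 36-character string form with the raw 16-byte value:

```python
import uuid

# Generate a random (version 4) UUID. No coordination with other
# machines and no locking is needed to do this.
u = uuid.uuid4()

text_form = str(u)     # 32 hex digits plus 4 dashes
binary_form = u.bytes  # the raw 128-bit value

text_bits = len(text_form) * 8      # 36 characters * 8 bits each
binary_bits = len(binary_form) * 8  # 16 bytes * 8 bits each

print(text_bits, binary_bits)
```

The string form comes out at 288 bits and the binary form at 128 bits, which is where the numbers above come from.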
So maybe for the (minuscule) speed increase or for the purpose of correct semantics or just for interoperability with other applications, you might want to use native PostgreSQL UUIDs from your Rails (or other, but without the abstraction of a "Migration", just using UUID is trivial) applications.
This already works quite nicely if you generate the columns as strings in your migrations and then manually send an alter table (whenever you restore the schema from scratch).
But if you want to create the column with the correct type directly from the migration and you want the column to be created correctly when using rake db:schema:load, then you need a bit of additional magic, especially if you want to still support other databases.
In my case, I was using PostgreSQL in production (what else?), but on my local machine, for the purpose of getting started quickly, I wanted to still be able to use SQLite for development.
In the end, everything boils down to monkey patching ActiveRecord::ConnectionAdapters::*Adapters and PostgreSQLColumn of the same module. So here's what I've added to config/initializers/uuuid_support.rb (Rails 3.0.*):
module ActiveRecord
  module ConnectionAdapters
    SQLiteAdapter.class_eval do
      def native_database_types_with_uuid_support
        a = native_database_types_without_uuid_support
        a[:uuid] = {:name => 'varchar', :limit => 36}
        return a
      end
      alias_method_chain :native_database_types, :uuid_support
    end if ActiveRecord::Base.connection.adapter_name == 'SQLite'

    if ActiveRecord::Base.connection.adapter_name == 'PostgreSQL'
      PostgreSQLAdapter.class_eval do
        def native_database_types_with_uuid_support
          a = native_database_types_without_uuid_support
          a[:uuid] = {:name => 'uuid'}
          return a
        end
        alias_method_chain :native_database_types, :uuid_support
      end

      PostgreSQLColumn.class_eval do
        def simplified_type_with_uuid_support(field_type)
          if field_type == 'uuid'
            :uuid
          else
            simplified_type_without_uuid_support(field_type)
          end
        end
        alias_method_chain :simplified_type, :uuid_support
      end
    end
  end
end
In your migrations you can then use the :uuid type. In my sample case, this was it:
class AddGuuidToSites < ActiveRecord::Migration
  def self.up
    add_column :sites, :guuid, :uuid
    add_index :sites, :guuid
  end

  def self.down
    remove_column :sites, :guuid
  end
end
With a bit better Ruby knowledge than I have, it should be possible to just monkey-patch the parent AbstractAdapter while still calling the method of the current subclass. This would not require a separate patch for every adapter in use.
For my case which was just support for SQLite and PostgreSQL, the above initializer was fine though.
How I back up gmail
There was a discussion on HackerNews about Gmail having lost the email in some accounts. One sentiment in the comments was clear:
It's totally the user's problem if they don't back up their cloud-based email.
Personally, I think I would have to agree:
Google is a provider like every other ISP or basically any other service. There's no reason to believe that your data is safer with Google than it is anywhere else. Now granted, they are not exactly known for losing data, but there are other things that can happen.
Like your account being closed because whatever automated system believed your usage patterns were consistent with those of a spammer.
So the question is: What would happen if your Google account wasn't reachable at some point in the future?
For my company (using commercial Google Apps accounts), I would start up that IMAP server which serves all mail ever sent to and from Gmail. People would use the already existing webmail client or their traditional IMAP clients. They would lose some productivity, but no single byte of data.
This was my condition for migrating email over to Google. I needed to have a backup copy of that data. Otherwise, I would not have agreed to switch to a cloud-based provider.
The process is completely automated too. There's not even a backup script running somewhere. Heck, not even the Google Account passwords have to be stored anywhere for this to work.
So. How does it work then?
Before you read on, here are the drawbacks of the solution:
- I'm a die-hard Exim fan (long story. It served me very well once - up to saving-my-ass level of well), so the configuration I'm outlining here is for Exim as the mail relay.
- Also, this only works with paid Google accounts. You can get somewhere using the free ones, but you don't get the full solution (i.e. having a backup of all sent email)
- This requires you to have full control over the MX machine(s) of your domain.
If you can live with this, here's how you do it:
First, you set up your Google domain as normal. Add all the users you want and do everything else just as you would do it in a traditional set up.
Next, we'll have to configure Google Apps for two-legged OAuth access to our accounts. I've written about this before. We are doing this so we don't need to know our users' passwords. Also, we need to enable the provisioning API to get access to the list of users and groups.
Next, our mail relay will have to know about what users (and groups) are listed in our Google account. Here's what I quickly hacked together in Python (my first Python script ever - be polite while flaming) using the GData library:
import gdata.apps.service

consumer_key = 'yourdomain.com'
consumer_secret = '2-legged-consumer-secret' #see above
sig_method = gdata.auth.OAuthSignatureMethod.HMAC_SHA1

service = gdata.apps.service.AppsService(domain=consumer_key)
service.SetOAuthInputParameters(sig_method, consumer_key,\
    consumer_secret=consumer_secret, two_legged_oauth=True)

res = service.RetrieveAllUsers()
for entry in res.entry:
    print entry.login.user_name

import gdata.apps.groups.service

service = gdata.apps.groups.service.GroupsService(domain=consumer_key)
service.SetOAuthInputParameters(sig_method, consumer_key,\
    consumer_secret=consumer_secret, two_legged_oauth=True)

res = service.RetrieveAllGroups()
for entry in res:
    print entry['groupName']
Place this script somewhere on your mail relay and run it in a cron job. In my case, I'm having its output redirected to /etc/exim4/gmail_accounts. The script will emit one user (and group) name per line.
Next, we'll deal with incoming email:
In the Exim configuration of your mail relay, add the following routers:
yourdomain_gmail_users:
driver = accept
domains = yourdomain.com
local_parts = lsearch;/etc/exim4/gmail_accounts
transport_home_directory = /var/mail/yourdomain/${lc:$local_part}
router_home_directory = /var/mail/yourdomain/${lc:$local_part}
transport = gmail_local_delivery
unseen
yourdomain_gmail_remote:
driver = accept
domains = yourdomain.com
local_parts = lsearch;/etc/exim4/gmail_accounts
transport = gmail_t
yourdomain_gmail_users is what creates the local copy. It accepts all mail sent to yourdomain.com, if the local part (the stuff in front of the @) is listed in that gmail_accounts file. Then it sets up some paths for the local transport (see below) and marks the mail as unseen so the next router gets a chance too.
Which is yourdomain_gmail_remote. This one is again checking domain and the local part and if they match, it's just delegating to the gmail_t remote transport (which will then send the email to Google).
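To make the interplay of the two routers a bit more tangible, here is a toy model in Python. It is not Exim and not part of the setup. The function names `load_accounts` and `route` are made up for this sketch, and matching names case-insensitively is an assumption of the model. All it shows is that a listed recipient ends up on both transports, because `unseen` lets the message continue to the next router:

```python
def load_accounts(lines):
    """Parse the gmail_accounts file format: one user or group name per line."""
    return {line.strip().lower() for line in lines if line.strip()}


def route(recipient, accounts, domain="yourdomain.com"):
    """Return the transports a message to `recipient` would be handed to."""
    local_part, _, rcpt_domain = recipient.rpartition("@")
    transports = []
    if rcpt_domain.lower() != domain or local_part.lower() not in accounts:
        # Neither router matches; other routers (not modelled) would take over.
        return transports
    # yourdomain_gmail_users: store the local copy. Because of `unseen`,
    # routing continues with the next router instead of stopping here.
    transports.append("gmail_local_delivery")
    # yourdomain_gmail_remote: hand the same message on to Google.
    transports.append("gmail_t")
    return transports


accounts = load_accounts(["alice\n", "bob\n"])
print(route("alice@yourdomain.com", accounts))
print(route("carol@yourdomain.com", accounts))
```

Without `unseen` on the first router, the model would stop after the local delivery and Google would never see the message.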
The transports look like this:
gmail_t:
driver = smtp
hosts = aspmx.l.google.com:alt1.aspmx.l.google.com:\
alt2.aspmx.l.google.com:aspmx5.googlemail.com:\
aspmx2.googlemail.com:aspmx3.googlemail.com:\
aspmx4.googlemail.com
gethostbyname
gmail_local_delivery:
driver = appendfile
check_string =
delivery_date_add
envelope_to_add
group=mail
maildir_format
directory = MAILDIR/yourdomain/${lc:$local_part}
maildir_tag = ,S=$message_size
message_prefix =
message_suffix =
return_path_add
user = Debian-exim
create_file = anywhere
create_directory
The gmail_t transport is simple. For the local one, you might have to adjust the user and group plus the location where you want to write the mail to.
Now we are ready to reconfigure Google as this is all that's needed to get a copy of every inbound mail into a local maildir on the mail relay.
Here's what you do:
- You change the MX of your domain to point to this relay of yours
The next two steps are the reason you need a paid account: These controls are not available for the free accounts:
- In your Google Administration panel, you visit the Email settings and configure the outbound gateway. Set it to your relay.
- Then you configure your inbound gateway and set it to your relay too (and to your backup MX if you have one).
This screenshot will help you:
All email sent to your MX (over the gmail_t transport we have configured above) will now be accepted by gmail.
Also, Gmail will now send all outgoing Email to your relay which needs to be configured to accept (and relay) email from Google. This pretty much depends on your otherwise existing Exim configuration, but here's what I added (which will work with the default ACL):
hostlist google_relays = 216.239.32.0/19:64.233.160.0/19:66.249.80.0/20:\
72.14.192.0/18:209.85.128.0/17:66.102.0.0/20:\
74.125.0.0/16:64.18.0.0/20:207.126.144.0/20
hostlist relay_from_hosts = 127.0.0.1:+google_relays
And lastly, the tricky part: Storing a copy of all mail that is being sent through Gmail (we are already correctly sending the mail. What we want is a copy):
Here is the exim router we need:
gmail_outgoing:
driver = accept
condition = "${if and{\
{ eq{$sender_address_domain}{yourdomain.com} }\
{=={${lookup{$sender_address_local_part}lsearch{/etc/exim4/gmail_accounts}{1}}}{1}}} {1}{0}}"
transport = store_outgoing_copy
unseen
(did I mention that I severely dislike RPN?)
and here's the transport:
store_outgoing_copy:
driver = appendfile
check_string =
delivery_date_add
envelope_to_add
group=mail
maildir_format
directory = MAILDIR/yourdomain/${lc:$sender_address_local_part}/.Sent/
maildir_tag = ,S=$message_size
message_prefix =
message_suffix =
return_path_add
user = Debian-exim
create_file = anywhere
create_directory
The maildir I've chosen is the correct one if the IMAP-server you want to use is Courier IMAPd. Other servers use different methods.
One little thing: When you CC or BCC other people in your domain, Google will send out multiple copies of the same message. This will yield some message duplication in the sent directory (one per recipient), but as they say: Better backup too much than too little.
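If the duplicates ever bother you, they can be found after the fact. The following is a rough sketch using Python's standard `mailbox` module. It assumes that copies of the same message share a Message-ID header, which I have not verified for Google's per-recipient copies, and `find_duplicates` is just a name made up for this example. It only reports duplicates and deletes nothing:

```python
import mailbox


def find_duplicates(maildir_path):
    """Return the keys of messages whose Message-ID was already seen."""
    box = mailbox.Maildir(maildir_path, create=False)
    seen = set()
    duplicates = []
    for key, message in box.iteritems():
        message_id = message.get("Message-ID")
        if message_id is None:
            # Without a Message-ID we cannot tell, so keep the message.
            continue
        if message_id in seen:
            duplicates.append(key)
        else:
            seen.add(message_id)
    return duplicates
```

You would call it with the path of one user's `.Sent` directory and then decide for yourself what to do with the returned keys.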
Now if something happens to your google account, just start up an IMAP server and have it serve mail from these maildir directories.
And remember to back them up too, but you can just use rsync or rsnapshot or whatever other technology you might have in use. They are just directories containing one file per email.
Find relation sizes in PostgreSQL
Like so many times before, today I was yet again in the situation where I wanted to know which tables/indexes take the most disk space in a particular PostgreSQL database.
My usual procedure in this case was to \dt+ in psql and scan the sizes by eye (this being on my development machine, trying to find out the biggest tables I could clean out to make room).
But once you've done that a few times and considering that \dt+ does nothing but query some PostgreSQL internal tables, I thought that I want this solved in an easier way that also is less error prone. In the end I just wanted the output of \dt+ sorted by size.
This led to some digging in the source code of psql itself (src/bin/psql), where I quickly found the function that builds the query (listTables in describe.c). So from now on, this is what I'm using when I need an overview of all relation sizes in descending order:
select
  n.nspname as "Schema",
  c.relname as "Name",
  case c.relkind
    when 'r' then 'table'
    when 'v' then 'view'
    when 'i' then 'index'
    when 'S' then 'sequence'
    when 's' then 'special'
  end as "Type",
  pg_catalog.pg_get_userbyid(c.relowner) as "Owner",
  pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid)) as "Size"
from pg_catalog.pg_class c
  left join pg_catalog.pg_namespace n on n.oid = c.relnamespace
where c.relkind IN ('r', 'v', 'i')
order by pg_catalog.pg_relation_size(c.oid) desc;
Of course I could have come up with this without source code digging, but honestly, I didn't know about relkinds, about pg_size_pretty and pg_relation_size (I would have thought that one to be stored in some system view), so figuring all of this out would have taken much more time than just reading the source code.
Now it's here so I remember it next time I need it.
How to kill IE performance
While working on my day job, we are often dealing with huge data tables in HTML augmented with some JavaScript to do calculations with that data.
Think huge shopping cart: You change the quantity of a line item and the line total as well as the order total will change.
This leads to the same data (line items) having three representations:
- The model on the server
- The HTML UI that is shown to the user
- The model that's seen by JavaScript to do the calculations on the client side (and then updating the UI)
You might think that the JavaScript running in the browser would somehow be able to work with the data from 2) so that the third model wouldn't be needed, but due to various localization issues (think number formatting) and data that's not displayed but affects the calculations, that's not possible.
So the question is: Considering we have some HTML templating language to build 2), how do we get to 3).
Back in 2004 when I initially designed that system (using AJAX before it was widely called AJAX even), I hadn't seen Crockford's lectures yet, so I still lived in the "JS sucks" world, where I've done something like this
<!-- lots of TRs -->
<tr>
  <td>Column 1 <script>addSet(1234 /*prodid*/, 1 /*quantity*/, 10 /*price*/, /* and, later, more, stuff, so, really, ugly */)</script></td>
  <td>Column 2</td>
  <td>Column 3</td>
</tr>
<!-- lots of TRs -->
(Yeah - as I said: 2004. No object literals, global functions. We had a lot to learn back then, but so did you, so don't be too angry at me - we improved)
Obviously, this doesn't scale: As the line items got more complicated, that parameter list grew and grew. The HTML code got uglier and uglier and of course, cluttering the window object is a big no-no too. So we went ahead and built a beautiful design:
<!-- lots of TRs -->
<tr class="lineitem" data-ps-lineitem='{"prodid": 1234, "quantity": 1, "price": 10, "foo": "bar", "blah": "blah"}'>
  <td>Column 1</td>
  <td>Column 2</td>
  <td>Column 3</td>
</tr>
<!-- lots of TRs -->
The first iteration was then parsing that JSON every time we needed to access any of the associated data (and serializing again whenever it changed). Of course this didn't go that well performance-wise, so we began caching and did something like this (using jQuery):
$(function(){
  $('.lineitem').each(function(){
    this.ps_data = $.parseJSON($(this).attr('data-ps-lineitem'));
  });
});
Now each DOM element representing one of these line items carries its parsed data in its ps_data property.
This design is reasonably clean (still not as DRY as the initial attempt which had the data only in that JSON string) while still providing enough performance.
Until you begin to amass datasets. That is.
Well. Until you do so and expect this to work in IE.
800 rows like this made IE lock up its UI thread for 40 seconds.
So more optimization was in order.
First,
$('.lineitem')
will kill IE. Remember: IE (still) doesn't have getElementsByClassName, so in IE, jQuery has to iterate the whole DOM and check whether each element's class attribute contains "lineitem". Considering that IE's DOM isn't really fast to start with, this is a HUGE no-no.
So.
$('tr.lineitem')
Nope. Nearly as bad considering there are still at least 800 tr's to iterate over.
$('#whatever tr.lineitem')
Would help if it weren't 800 tr's that match. Using dynaTrace AJAX (highly recommended tool, by the way) we found out that just selecting the elements alone (without the iteration) took more than 10 seconds.
So the general take-away is: Selecting lots of elements in IE is painfully slow. Don't do that.
But back to our little problem here. Unserializing that JSON at DOM ready time is not feasible in IE, because no matter what we do to that selector, once there are enough elements to handle, it's just going to be slow.
Now by chunking up the amount of work to do and using setTimeout() to launch various deserialization jobs we could fix the locking up, but the total run time before all data is deserialized will still be the same (or slightly worse).
So what we had done in 2004, even though it was ugly, was way more feasible in IE.
Which is why we went back to the initial design with some improvements:
<!-- lots of TRs -->
<tr class="lineitem">
  <td>Column 1 <script>PopScan.LineItems.add({"prodid": 1234, "quantity": 1, "price": 10, "foo": "bar", "blah": "blah"});</script></td>
  <td>Column 2</td>
  <td>Column 3</td>
</tr>
<!-- lots of TRs -->
*phew* crisis averted.
Loading time went back to where it was in the 2004 design. It was still bad though. With those 800 rows, IE was still taking more than 10 seconds for the rendering task. dynaTrace revealed that this time, the time was apparently spent rendering.
The initial feeling was that there's not much to do at that point.
Until we began suspecting the script tags.
Doing this:
<!-- lots of TRs -->
<tr class="lineitem">
  <td>Column 1</td>
  <td>Column 2</td>
  <td>Column 3</td>
</tr>
<!-- lots of TRs -->
The page loaded instantly.
Doing this
<!-- lots of TRs -->
<tr class="lineitem">
  <td>Column 1 <script>1===1;</script></td>
  <td>Column 2</td>
  <td>Column 3</td>
</tr>
<!-- lots of TRs -->
it took 10 seconds again.
Considering that IE's JavaScript engine runs as a COM component, this isn't actually that surprising: Whenever IE hits a script tag, it stops whatever it's doing, sends that script over to the COM component (first doing all the marshaling of the data), waits for that to execute, marshals the result back (depending on where the DOM lives and whether the script accesses it, possibly crossing that COM boundary many, many times in between) and then finally resumes page loading.
It has to wait for each script because, potentially, that JavaScript could call document.open() / document.write() at which point the document could completely change.
So the final solution was to loop through the server-side model twice and do something like this:
<!-- lots of TRs -->
<tr class="lineitem">
  <td>Column 1 </td>
  <td>Column 2</td>
  <td>Column 3</td>
</tr>
<!-- lots of TRs -->
</table>
<script>
PopScan.LineItems.add({prodid: 1234, quantity: 1, price: 10, foo: "bar", blah: "blah"});
// 800 more of these
</script>
Problem solved. Not too ugly design. Certainly no 2004 design any more.
And in closing, let me give you a couple of things you can do if you want to bring the performance of IE down to its knees:
- Use broad jQuery selectors. $('.someclass') will cause jQuery to loop through all elements on the page.
- Even if you try not to be broad, you can still kill performance: $('div.someclass'). The most help jQuery can expect from IE is getElementsByTagName, so while it's better than iterating all elements, it's still going over all divs on your page. Once there are more than 200 of them, performance falls off extremely quickly (probably doing some O(n^2) thing somewhere).
- Use a lot of <script> tags. Every one of these will force IE to marshal data to the scripting engine COM component and to wait for the result.
Next time, we'll have a look at how to use jQuery's delegate() to handle common cases with huge selectors.
Nokia N900 and iPhone headsets?
For a geek like me, the Nokia N900 is paradise on earth: It's full Debian Linux in your bag. It has the best IM integration I have ever seen on any mobile device. It has the best VoIP (Skype, SIP) integration I have ever seen on any mobile device and it has one of the coolest multitasking implementations I've seen on any mobile device (the card-based task/application switching is fantastic).
Unfortunately, there's one thing that prevents me from using it (or many other phones) to replace my iPhone: While the whole world agreed on one way to wire a microphone/headphone combination, Apple thought it wise to do it another way, which leads to Apple compatible headsets not working with the N900.
By not working I don't just mean "no microphone" or even "no sound". No. I mean "deafening buzzing on both the left and right channel and headset still not being recognized in the software".
The problem is that I already own iPhone compatible headsets and that it's way easier to get good iPhone compatible ones around here. I'm constantly listening to audio on my phone (podcasts, audiobooks). Having to grab the phone out of my bag and unplug the headphones whenever it rings is unacceptable to me, so I need to have a microphone with my headphones.
Just now though, I found a small adapter which promises to solve that problem, proving once again, that there's nothing that's not being sold on the internet.
I ordered one (thankfully one of the international shipping options cost less than the adapter itself, something I'm not used to with the smaller stores), so we'll see how that goes. If it means that I can use an N900 as my one and only device, I'll be a very happy person indeed.
Do spammers find pleasure in destroying fun stuff?
Recently, while reading through the log file of the mail relay used by tempalias, I noticed a disturbing trend: Apparently, SPAM was being sent through tempalias.
I've seen various behaviours. One was to strangely create an alias per second to the same target and then deliver email there.
While I completely fail to understand this scheme, the other one was even more disturbing: Bots were registering {max-usage: 1, days: null} aliases and then sending one mail to them - probably to get around RBL checks they'd hit when sending SPAM directly.
Aside from the fact that I do not want to be helping spammers, this also posed a technical issue: node.js head, which I was running back when I developed the service, tended to leak memory at times, forcing me to restart the service now and then.
Now the additional huge load created by the bots forced me to do that way more often than I wanted to. Of course, the old code didn't run on current node any more.
Hence I had to take tempalias down for maintenance.
A quick look at my commits on GitHub will show you what I have done:
- the tempalias SMTP daemon now does RBL checks and immediately disconnects if the connected host is listed.
- the tempalias HTTP daemon also does RBL checks on alias creation, but it doesn't check the various DUL lists as the most likely alias creators are most certainly listed in a DUL
- Per IP, aliases can only be generated every 30 seconds.
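The per-IP limit from the last point is simple enough to sketch. The following is a minimal illustration in Python, not the actual tempalias code (which is written for node.js). The class name `AliasRateLimiter`, the in-memory dictionary and the injectable clock are all assumptions made for the example:

```python
import time


class AliasRateLimiter:
    """Allow at most one alias creation per IP every `min_interval` seconds."""

    def __init__(self, min_interval=30.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock          # injectable so the logic can be tested
        self._last_creation = {}    # IP address -> time of last accepted request

    def allow(self, ip):
        now = self.clock()
        last = self._last_creation.get(ip)
        if last is not None and now - last < self.min_interval:
            return False            # too soon, reject the request
        self._last_creation[ip] = now
        return True


limiter = AliasRateLimiter()
print(limiter.allow("192.0.2.1"))  # first request from this IP
print(limiter.allow("192.0.2.1"))  # immediately repeated request
```

In this sketch a rejected request does not restart the 30 second window. Whether it should is a design decision; restarting it would punish bots that keep hammering the service, but also impatient humans.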
This should be of some help. In addition, right now, the mail relay is configured to skip sender checks and sa-exim scans (SpamAssassin at SMTP time, so as to reject spam before even accepting it into the system) for hosts where relaying is allowed. I intend to change that so that sa-exim scanning and sender verification are done regardless of whether the connecting host is the tempalias proxy.
Looking at the mail log, I've seen the spam count drop to near-zero, so I'm happy, but I know that this is just a temporary victory. Spammers will find ways around the current protection and I'll have to think of something else (I do have some options, but I don't want to pre-announce them here for obvious reasons).
On a more happy note: During maintenance I also fixed a few issues with the Bookmarklet which should now do a better job at not coloring all text fields green eventually and at using the target site's jQuery if available.
Windows 2008 / NAT / Direct connections
Yesterday I ran into an interesting problem with Windows 2008's implementation of NAT (don't ask - this was the best solution - I certainly don't recommend using Windows for this purpose).
Whenever I enabled the NAT service, I was unable to reliably connect to the machine via remote desktop or even any other service that machine was offering. Packets sent to the machine were dropped as if a firewall was in between, but it wasn't and the Windows firewall was configured to allow remote desktop connections.
Strangely, sometimes and from some hosts I was able to make a connection, but not consistently.
After some digging, this turned out to be a problem with the interface metrics: the server tried to respond over the interface with the private address, which wasn't routed.
So if you are in the same boat, configure the interface metrics of both interfaces manually. Set the metric of the private interface to a high value and the metric of the public (routed) one to a low value.
At least for me, this instantly fixed the problem.
tempalias.com – debriefing
This is the last part of the development diary I was keeping about the creation of a new web service in node.js. You can read the previous installment here.
It's done.
The layout is finished, and the last edges too rough for pushing the thing live have been smoothed. tempalias.com is live. After coming really close to finishing the thing yesterday (hence the lack of a posting here: I was too tired when I had to quit at 2:30am), I could now complete the results page and add the needed finishing touches (like a really cool way of catching enter to proceed from the first to the last form field, my favorite hidden feature).
I guess it's time for a little debriefing:
All in all, the project took a time span of 17 days to implement from start to finish. I did this after work and mostly during weekdays and sundays, so it's actually 11 days in which work was going on (I also was sick two days). Each day I worked around 4 hours, so all in all, this took around 44 hours to implement.
A significant part of this time went into modifications of third party libraries, while I tried to contact the initial authors to get my changes merged upstream:
- The author of node-smtp isn't interested in the SMTP daemon functionality (that wasn't there when I started and is now completed)
- The author of redis-node-client didn't like my patch, but we had a really fruitful discussion and redis-node-client got a lot better at handling dropped connections in the process.
- The author of node-paperboy has merged my patch for a nasty issue and even tweeted about it (THANKS!)
Before I continue, I want to say a huge thanks to fictorial on github for the awesome discussion I was allowed to have with him about redis-node-client's handling of dropped connections. I've enjoyed every word I was typing and reading.
But back to the project.
Non-third-party code consists of just 1624 lines of code (using wc -l, so not an accurate measurement). This doesn't factor in the huge amount of changes I made to my fork of node-smtp, the daemon part of which was basically non-existent.
Overall, the learnings I made:
- git and github are awesome. I knew that beforehand, but this just cemented my opinion
- node.js and friends are still in their infancy. While node removes previously published API on a nearly daily basis (it's mostly bug-free though), none of the third-party libraries I am using were sufficiently bug-free to use them without change.
- Asynchronous programming can be fun if you have closures at your disposal
- Asynchronous programming can be difficult once the nesting gets deep enough
- Making any variable not declared with var global is the worst design decision I have ever seen in my life (especially in node, where we are adding concurrency to the mix)
- While it's possible (and IMHO preferable) to have a website done in just RESTful webservices and a static/javascript frontend, sometimes just a tiny little bit of HTML generation could be useful. Still. Everything works without emitting even a single line of dynamically generated HTML code.
- Node is crazy fast.
Also, I want to take the opportunity and say huge thanks to:
- the guys behind node.js. I would have had to do this in PHP or even rails (which is even less fitting than PHP as it provides so much functionality around generating dynamic HTML and so little around pure JSON based web services) without you guys!
- Richard for his awesome layout
- fictorial for redis-node-client and for the awesome discussion I was having with him.
- kennethkalmer for his work on node-smtp even though it was still incomplete. You led me onto the right track for writing an SMTP daemon. Thank you!
- @felixge for node-paperboy - static file serving done right
- The guys behind sammy - writing fully JS based AJAX apps has never been easier and more fun.
Thank you all!
The next step will be marketing: Seeing that this is built on node.js and is an actually usable project, way beyond the usual little experiments, I hope to gather some interest in the Hacker community. Seeing that it also provides a real-world use, I'll even go and try to submit news about the project to more general outlets. And of course to the Security Now! feedback page, as this is inspired by their episode 242.
tempalias.com – development diary
This week's Security Now! podcast, where they were discussing disposeamail.com, reminded me of this little idea I had back in 2002: self-destructing email addresses.
Instead of providing a web interface for a catchall alias, my solution was based around the idea of providing a way to encode time-based validity information and even a usage counter into an email address and then check that information on reception of the email to decide whether to alias the source address to a target address or whether to decline delivery with a "User unknown" error.
This would allow you to create temporary email aliases which redirect to your real inbox for a short amount of time or amount of emails, but instead of forcing you to visit some third-party web interface, you would get the email right there where the other messages end up in: In your personal inbox.
Of course this old solution had one big problem: It required a mail server on the receiving end and it required you as a possible user to hook the script into that mailserver (also, I never managed to do just that with exim before losing interest, but by now, I would probably know how to do it).
Now. Here comes the web 2.0 variant of the same thing.
tempalias.com (yeah. it was still available. so was .net) will provide you with a web service that allows you to create a temporary mail address that redirects to your real address. This temporary alias will be valid only for a certain date range and/or a certain number of emails sent to it. You will be able to freely choose the date range and/or invocation count.
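To make the "date range and/or invocation count" rule a bit more concrete, here is a rough sketch of how such a check could look. This is not the actual tempalias code: the function names (`createAlias`, `isAliasValid`, `deliver`) and the field names are made up for illustration, and a `null` limit is assumed to mean "no restriction".

```javascript
// Hypothetical alias record. Field names are assumptions, not the project's model.
// A null limit means that this restriction does not apply.
function createAlias({ target, validFrom = null, validUntil = null, maxUses = null }) {
  return { target, validFrom, validUntil, maxUses, uses: 0 };
}

// Returns true if the alias may still forward mail at the time `now`.
function isAliasValid(alias, now = new Date()) {
  if (alias.validFrom !== null && now < alias.validFrom) return false;   // not yet valid
  if (alias.validUntil !== null && now > alias.validUntil) return false; // expired
  if (alias.maxUses !== null && alias.uses >= alias.maxUses) return false; // used up
  return true;
}

// Resolves the forwarding target for one incoming mail and counts the use.
// Returns null when the mail should be declined (e.g. with "User unknown").
function deliver(alias, now = new Date()) {
  if (!isAliasValid(alias, now)) return null;
  alias.uses += 1;
  return alias.target;
}
```

An alias created with `maxUses: 2` would forward two mails and decline the third, regardless of whether its date range is still open.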
In contrast to the other services out there, the alias will direct to your standard inbox. No ad-filled web interface. No security problems caused by typos and no account registration.
Also, the service will be completely open source, so you will be able to run your own.
My motivation is to learn something new, which is why I am
- writing this thing in Node.js (also, because a simple REST based webapp and a simple SMTP proxy is just what node.js was invented for)
- documenting my progress of implementation here (which also hopefully keeps me motivated).
My progress in implementing the service will always be visible to the public on the project's GitHub page:
http://github.com/pilif/tempalias
As you can see, there's already stuff there. Here's what I've learned about today and what I've done today:
- I learned how to use git submodules
- I learned a bunch about node.js - how to install it, how it works, how module separation works and how to export stuff from modules.
- I learned about the Express micro framework (which does exactly what I need here)
- I learned how request routing works
- I learned how to configure the framework for my needs (and how that's done internally)
- I learned how to play with HTTP status codes and how to access information about the request
What I've accomplished code-wise is quite little, considering the huge amount of stuff I had no clue about:
- I added the web server code that will run the webapp
- I created a handler that handles a POST-request to /aliases
- Said handler checks the content type of the request
- I added a very rudimentary model class for the aliases (and learned how to include and use that)
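For illustration, a handler like the one described in the list above could be sketched as follows. This is a minimal sketch and not the project's code: it uses plain Node-style request and response objects instead of Express so that it stays self-contained, and the expected content type (`application/json`), the status codes and the `send` helper are all assumptions.

```javascript
// Decides how to answer a request. Works with the (req, res) objects that
// Node's built-in http server passes to its request listener.
function handleRequest(req, res) {
  const path = req.url.split('?')[0];
  if (path !== '/aliases') {
    return send(res, 404, { error: 'not found' });
  }
  if (req.method !== 'POST') {
    res.setHeader('Allow', 'POST');
    return send(res, 405, { error: 'method not allowed' });
  }
  // Ignore parameters such as "; charset=utf-8" when comparing the type.
  const contentType = (req.headers['content-type'] || '')
    .split(';')[0].trim().toLowerCase();
  if (contentType !== 'application/json') { // assumed type, see lead-in
    return send(res, 415, { error: 'expected application/json' });
  }
  // Creating the alias from the request body is left out of this sketch.
  return send(res, 501, { error: 'alias creation not implemented' });
}

// Hypothetical helper: writes a JSON response with the given status code.
function send(res, status, body) {
  res.statusCode = status;
  res.setHeader('Content-Type', 'application/json');
  res.end(JSON.stringify(body));
}

// To serve it (the port is an arbitrary choice):
// require('http').createServer(handleRequest).listen(8080);
```

Keeping the decision logic in a plain function means it can be exercised with simple stand-in objects, without opening a socket.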
I still don't know how I will store the alias information. In a sense, it's a really simple data model mapping an alias ID to its information, so it's predestined for the cool key/value stores out there. On the other hand, I want the application to be simple and I don't feel like adding a key/value store as a huge dependency just for keeping track of 3 values per alias.
Before writing more code, I'll have to find out how to proceed.
So the next update will probably be about that decision.
Google Buzz, Android and Google Apps Accounts
I was looking at the Google Android Maps Application that is now providing integrated Google Buzz support, showing buzzes directly on the map and allowing you to buzz (around where I live and work, there has been a tremendous uptake of Google Buzz which makes this really compelling).
However, there's a little peculiarity about the Android Maps application: if the main Google Account on your phone (that's the first one you configure) is a Google Apps account, Maps will use that for Buzz support (apparently, there's already some kind of infrastructure for intra-company buzzing in place). This means that you would only see buzzes from other people in your domain and, because there's no official support for this out there, only if they are also using an Android phone.
"Mittelpraktisch" as I would say in German.
The obvious workaround is to make your private Gmail account your primary account (which, by the way, is only possible by factory-resetting your device). This has some disadvantages, though, mainly that the calendar on Android phones only supports syncing with the primary account, and it's usually the work calendar (the Apps one) you want synchronized, not the private one (which lingers unused in my case).
To work around this issue, share your work calendar with your private Google account.
Unfortunately, I couldn't do that as I'm posting this, because the default in the domain configuration is to not allow this. Thankfully, I'm that domain's administrator, so I could change it (small company. remember.), but it seems to take a while to propagate into the calendar account.
I'll post more as my investigation turns up more, though my gut feeling is that this mess will solve itself once Google fixes their Maps application to not use that phantom corporate Buzz account.


