The cloud is all the hype now, but most people forget that it is not a silver bullet or a panacea. The cloud (or, more technically speaking, the use of large-scale virtualization) is recommended for specific use cases. In this article, we're going to go through the use cases for which the cloud is recommended. In the final chapter, I will emphasize some of the caveats, traps and risks associated with transitioning to the cloud.
Whether we realize it or not, we are already users of cloud services: every time we sign into Gmail, Picasa, Facebook, LinkedIn, Office 365 or Dropbox, we use the cloud. This is the most frequent and widespread use case. For the average consumer, freelancer or small and medium company, it does not make sense to invest in and directly operate services such as:
- Email (Gmail for Business, Office 365)
- File Storage (Google Drive, Dropbox)
- Document Management (Gmail for Business, Confluence, Office 365)
The providers of such services can afford to invest much more knowledge and effort into operating a reliable infrastructure, because of their scale. In other words, if developing service X (let's say email) were to cost $10 million, it would not make sense for a company with 20 employees, but it would make sense for a service provider which has 100,000 such companies as its customers. The advantages of using such cloud services revolve around:
- No up-front cost (like the cost of owning hardware)
- Predictable operating cost (for instance, Google Drive sells 1TB for $120/year)
- Less downtime and better quality of the service
- Better data reliability (as big service providers can afford to store the same data in 3 or more places at any one time)
However, one also has to be aware of the risks and down-sides of such services:
- No Internet connection can mean no access to your data or to the service. This risk can be partially mitigated by keeping local snapshots of the data, but it is worth considering. Nevertheless, in today's connected business, no Internet often means no business.
- No physical ownership of the data. From an operating perspective this is rarely a problem, but this is worth considering from a legal and business continuity perspective.
- Potentially slower access to the data, especially for large files (studio size images, videos), as the Internet is still considerably slower than a local network.
To facilitate the transition from an on-site service to a cloud service, it is always a good idea to do a pilot program with a small team or for a smaller project, so as to have the chance to smooth out any bumps in the road with minimal business impact.
Seasonal Traffic Spikes

You might get average traffic of 1 million page views/hour most of the year, but spike to somewhere between 10 and 15 million page views/hour during 5 or 10 days of the year. This is especially common for e-commerce sites around the holidays (winter, spring). The non-cloud solutions would entail either over-scaling your infrastructure just to cope with those 5-10 days of traffic spikes, or settling for unsatisfactory performance during the most profitable time of the year. A cloud solution would allow you to grow or shrink your infrastructure depending on these needs, at one-hour granularity.

Let's say you can handle the "normal" traffic with 4 web servers and 4 application servers. Simple math would dictate that you need somewhere between 40-60 web servers and 40-60 application servers to handle the peak load (the exact number would depend on the application type, your business processes and the average machine load during "normal" traffic). If you were to host this infrastructure on-site, 90% of your cost would be wasted during the 355 days of the year when you don't need the extra juice. What a cloud provider does is allow you to activate and pay for this extra infrastructure only when you need it.

In cases where spikes are expected to have a yearly seasonality, it is reasonable to rely on cloud services only during that time of the year. However, many services, such as content streaming, may have a daily seasonality (high traffic during the evenings). Cases where the traffic seasonality is finer-grained (daily) may require moving the entire web-serving solution into the cloud.
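To make the arithmetic above concrete, here is a back-of-the-envelope sketch in Python. All the figures (the hourly rate, the server counts, the number of peak days) are illustrative assumptions based on the example in this section, not real vendor prices.

```python
HOURS_PER_YEAR = 365 * 24   # 8760

HOURLY_RATE = 0.20      # assumed cost of one server-hour
BASELINE_SERVERS = 8    # 4 web + 4 application servers for "normal" traffic
PEAK_SERVERS = 100      # ~50 web + ~50 application servers during spikes
PEAK_DAYS = 10          # days per year at peak load

peak_hours = PEAK_DAYS * 24

# Option A: own/rent peak capacity all year long.
on_site_cost = PEAK_SERVERS * HOURS_PER_YEAR * HOURLY_RATE

# Option B: run the baseline year-round, rent the extra capacity
# only during the spike days.
elastic_cost = (BASELINE_SERVERS * HOURS_PER_YEAR
                + (PEAK_SERVERS - BASELINE_SERVERS) * peak_hours) * HOURLY_RATE

savings = 1 - elastic_cost / on_site_cost
print(f"always-on peak: ${on_site_cost:,.0f}/year")
print(f"elastic:        ${elastic_cost:,.0f}/year ({savings:.0%} less)")
```

With these made-up numbers, the elastic option comes out roughly 90% cheaper, which is exactly the "waste" the always-on option carries for the other 355 days of the year.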
Once-in-a-while High Volume Data Crunching
For irregular, high-volume workloads, the cloud is also your friend. If, every three months, you need to crunch 100TB of data in one big scoop (in 12-48 hours), there's no reason for you to keep and pay for the required infrastructure for the entire three months. This use case is common in research, where workloads are not periodic and tend to be intensive. Another use case would be reporting. However, keep in mind that large-volume reporting workloads (quarterly, yearly reports) can often be split into smaller workloads (hourly/daily/weekly roll-ups), which can then be joined/summed up fairly quickly even on a smaller infrastructure. This way, the effort is spread over time and the quarterly/yearly spikes at the reporting date can be a lot smaller. Top three vendors:
- Amazon Web Services
- Microsoft Azure
- Google Cloud Platform
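The roll-up approach described above can be sketched in a few lines of Python. The schema (raw events as (day, amount) pairs) is a made-up toy, but the shape of the computation is the point: aggregate small batches as they arrive, then combine the pre-aggregated results cheaply at reporting time.

```python
from collections import defaultdict
from datetime import date

def daily_rollup(events):
    """Aggregate raw (day, amount) events into per-day totals."""
    totals = defaultdict(float)
    for day, amount in events:
        totals[day] += amount
    return dict(totals)

def combine(rollups):
    """Combining pre-aggregated days is cheap, even on modest hardware."""
    merged = defaultdict(float)
    for rollup in rollups:
        for day, total in rollup.items():
            merged[day] += total
    return sum(merged.values())

# Two small batches rolled up independently (e.g. on different days)...
batch_a = daily_rollup([(date(2014, 1, 1), 10.0), (date(2014, 1, 1), 5.0)])
batch_b = daily_rollup([(date(2014, 1, 2), 20.0)])

# ...then summed together at reporting time.
print(combine([batch_a, batch_b]))  # 35.0
```

The quarterly report then only touches the small per-day aggregates instead of the raw 100TB, which is what flattens the reporting-date spike.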
Static Assets, Geographic Spread and Content Delivery Networks
It just might be the case that a US-based company has a lot of users from Europe or Asia. In this case, it is highly advisable to use a Content Delivery Network (CDN). A CDN customarily delivers to end users the static assets from your site (stuff that rarely changes and that is the same regardless of the user): images, CSS files, JS files, videos. The CDN is basically a network of servers (called nodes or edges) around the world which copy the static content from your website (even if your website is NOT cloud-hosted) and then serve this static content many times over to different users in their geographic vicinity. This brings several advantages for your website:
- Offloading your main servers: your main server(s) only have to serve the content a few times (once for each CDN node/edge), while that content would then be served thousands of times to the end users.
- Closer means faster: on average, the edge will be geographically closer to the end-user, which means that the round-trip of the data packets will be faster, which means your site will load faster.
- Many is better: most CDNs will serve different requests (files) belonging to the same page from different domains, which means that end users' browsers will be able to request more resources at the same time – yet another source of speed-up.
It is important to understand that your web servers do not need to be virtualized or cloud-hosted in order for you to benefit from a CDN. You can very well set up a CDN over your on-site-hosted website. If you have a small website/blog without too many requirements around it, you can try a CDN such as CloudFlare for free. They have a tutorial around it (a bit technical, but you'll live).
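As a toy illustration of the origin/edge split, here is how a site might route asset URLs. The hostnames and the extension list are placeholders, not a real CDN configuration.

```python
# Placeholder hostnames; a real setup would use your CDN's assigned domain.
CDN_HOST = "cdn.example.com"
ORIGIN_HOST = "www.example.com"
STATIC_EXTENSIONS = (".css", ".js", ".png", ".jpg", ".mp4")

def asset_url(path: str) -> str:
    """Serve static files from the CDN edge, dynamic pages from the origin."""
    if path.endswith(STATIC_EXTENSIONS):
        return f"https://{CDN_HOST}{path}"
    return f"https://{ORIGIN_HOST}{path}"

print(asset_url("/img/logo.png"))  # https://cdn.example.com/img/logo.png
print(asset_url("/checkout"))      # https://www.example.com/checkout
```

Serving static files from a separate, cookie-less domain is also what enables the "many is better" parallel-download effect mentioned above.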
Backup and Disaster Recovery
This subject is a touchy one. Cloud storage services usually offer a good-to-great level of reliability/durability for storage (think many 9s). Amazon S3 promises 99.999999999% durability (that's nine nines after the decimal separator). This is far better than what you could possibly achieve on-site, which makes cloud services an ideal candidate for backing up your data. However, entrusting your data (and customer data, including personally identifying information, credit cards and so on) off-premises may be perceived as too high a risk, or may prove non-compliant with security standards for which your company is certified (such as PCI DSS or ISO/IEC 27001). One mitigation for this risk is to encrypt the data with a private key before transmitting it to the cloud service. However, this raises the issue of securely and reliably storing the keys (in at least two geographic locations, in order to achieve disaster recovery capability).

Thus, in the light version of this use case, one can use cloud storage services for periodically backing up data in a reliable/durable way, preferably with a layer of added encryption. However, in case of disaster, retrieving the data from cloud storage and resuming operations can take days or even weeks, which may prove unacceptable for business continuity. A use case more suitable for mature companies is to keep an up-to-date replica of their entire infrastructure ready for deployment (but not deployed) in the cloud. This cloud infrastructure could be activated in case of disaster so as to handle the load while the on-site infrastructure is being reinstated, bringing customer-facing downtime from days down to hours. However, this scenario requires guaranteeing data freshness through more frequent snapshots, as well as several complex disaster recovery tests to ensure the cloud infrastructure would deploy and function as expected.
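Here is a minimal, stdlib-only sketch of the backup-preparation step: compress the payload and record a digest so a restore can be verified end-to-end. A real pipeline would also add the client-side encryption discussed above (with a dedicated crypto library) before anything leaves the premises; that step is deliberately omitted here.

```python
import gzip
import hashlib

def prepare_backup(payload: bytes):
    """Compress the payload and record a SHA-256 digest of the original."""
    return gzip.compress(payload), hashlib.sha256(payload).hexdigest()

def verify_restore(compressed: bytes, digest: str) -> bytes:
    """Decompress and confirm the restored data matches the recorded digest."""
    restored = gzip.decompress(compressed)
    if hashlib.sha256(restored).hexdigest() != digest:
        raise ValueError("backup corrupted in transit or at rest")
    return restored

blob, checksum = prepare_backup(b"customer ledger, 2014")
assert verify_restore(blob, checksum) == b"customer ledger, 2014"
print("round-trip verified")
```

The digest belongs with your key material: store it in at least two locations, because a backup you cannot verify (or decrypt) is not a backup.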
Caveats and Risks

As no analysis is complete without also emphasizing the reasons against a certain solution, one should carefully consider the following points when planning to use or move to the cloud:
- Cost. At a large scale (hundreds of machines), having an on-site infrastructure and/or a private cloud (an on-site virtualization solution) can become more efficient than renting cloud services. However, most companies don't reach such a scale. Nevertheless, one should always keep a close eye on cost, as the freedom of expanding and shrinking afforded by cloud providers comes with a price. Also, consider that considerable reductions in cost can be achieved by using reservation plans with cloud providers: making a commitment for a certain usage over 1-3 years in exchange for a reduction in cost.
- Data Ephemerality. Unlike physical machines, virtual machines get terminated (read “disappear”) all the time. By default, if they are not configured to use persistent storage (such as Amazon’s Elastic Block Storage or Simple Storage Service), the data on them disappears with them. Make sure your persistent data is actually always stored on persistent media!
- Operational practices. Given that the cloud encourages the use of many (often smaller) virtual machines as opposed to a few large ones, it quickly becomes impractical for an operations team to manually configure each machine. That is, operations need to focus on automation (automatically deploying and configuring machines) and shift away from the "log on to the machine and configure it" mindset.
- Security. Although cloud providers offer a great array of tools for managing security (firewalls, dynamic keys), many system administrators may not be familiar with these tools and best practices. Make sure your team(s) have a good understanding before moving business-critical data and apps into the cloud.
- Performance/Cost. The virtualization associated with cloud computing comes with a performance penalty, the degree of which may vary greatly depending on the type of application you’re using. That isn’t to say you cannot get the same performance from a cloud machine that you can get from a physical machine (dedicated, bare-metal hosting). It just means that you may end up paying more for it. In other words, you might end up getting less bang for your buck. Be sure to benchmark performance on several instance types and make at least a high-level cost projection. Otherwise, you might end up unpleasantly surprising your CFO.
- Legal/regulatory requirements. Some companies (usually enterprise/corporate size) come under legal and regulatory requirements not to share customer data with third parties, or not to store/transmit customer data outside the country. Be sure to triple-check against said legal and regulatory requirements, or to find a technical solution which does not store/transmit customer data to the cloud provider (for instance, using a CDN for serving static assets from sessionless domains would be a good solution for separating cloud-delivered content from in-house-stored customer data).
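To put a number on the reservation plans mentioned under the cost point above, here is a toy comparison. The prices, discount and upfront fee are all made-up assumptions for illustration, not actual provider rates.

```python
# Toy comparison: on-demand vs a 1-year reservation with an upfront fee.
HOURS_PER_YEAR = 365 * 24

ON_DEMAND_HOURLY = 0.20     # assumed pay-as-you-go rate
RESERVED_HOURLY = 0.08      # assumed discounted rate after commitment
RESERVED_UPFRONT = 500.0    # assumed one-time reservation fee

on_demand_yearly = ON_DEMAND_HOURLY * HOURS_PER_YEAR
reserved_yearly = RESERVED_UPFRONT + RESERVED_HOURLY * HOURS_PER_YEAR

discount = 1 - reserved_yearly / on_demand_yearly
print(f"on-demand ${on_demand_yearly:,.0f}/year vs reserved "
      f"${reserved_yearly:,.0f}/year ({discount:.0%} cheaper)")
```

The catch, of course, is that the reservation only pays off if the machine actually runs for most of the committed period, which is exactly the planning burden the cloud was supposed to take away.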
The key take-away from this post is that cloud services provide a vast array of tools to help in today’s business and technology environment, without however being a one-size-fits-all solution. Either starting up on or transitioning to the cloud requires careful planning and an in-depth understanding of the processes to be implemented as well as of the technology landscape.
As always, any questions, comments and feedback are more than welcome. I’m also open to discussing specific use cases and integration scenarios.
or “The trap of positive thinking and how quitting should be an option”
Have drive, have perseverance. Give 110%. Go confidently in the direction of your dreams, because your dreams don't have an expiry date. Live the life you have imagined. If at first you don't succeed, try, try and try again.
Does all this motivational stuff sound familiar? About not giving up, about trying harder, about getting up stronger after every punch. Yes, being determined is good for your life, for your career and for your feeling of self-worth. Being focused and relentless helps you get the things you want. But at some point, after the second, or third, or tenth attempt at changing something, achieving something – you gotta take a deep breath, stop and ask yourself…
Am I beating a dead horse?
Yeah, are you? Maybe you're locked in. Maybe you've reached the top of a little hill and you're wondering why you can't go any higher.
Because regardless whether we’re talking about your job, or your business or your personal projects – there is the off chance that your vision does not match reality. No matter how much you want it, no matter how hard the Universe is conspiring to make your dreams come true, maybe someone else’s dreams are higher priority. Or I dunno, maybe you’re down the wrong path.
I'm just saying that your objectives and your aspirations need a review from time to time, just in case you're stuck pushing against a dead end. That dead end might be your job, which provides too little opportunity, satisfaction or visibility. Or your business, which is in an industry with zero or negative growth.
The point is that this whole motivational culture actually adds pressure and negative stress to the decisions we make in our career. It makes it shameful to give up, to quit, to admit failure. “How will others see me?” or “What are they going to think?”. The worst part is that “positive thinking” forces us to feel guilt and take responsibility for stuff that isn’t always in our control.
Let’s take an example. You might think the reason why you didn’t get that promotion has to do with your not trying hard enough, not being a team player, not being smart enough, not reaching your objectives. True. Your being awful at what you do is definitely a possibility. Other possibilities include:
- You’re not good at that particular job (although you might be well above average in other jobs)
- You don’t have the same vision as your superior or maybe he just doesn’t like you
- You might not be a good fit in your team
This isn't to say that you should blame external factors for each and every one of your failures/frustrations. Maybe you should just try something else. Roll the dice out of your comfort zone at least a little bit. The point is to know when to declare failure, when to throw in the towel – without feeling guilty or ashamed.
Of course, this whole post isn't about giving up and blaming the Universe. This post is about how choosing the problems you solve is as much your responsibility as actually solving them. You should teach yourself how to tell the stuff that's up to you to change (generally what you read, what you eat, who you hang out with, how much money you spend or save and other habits of yours) from the stuff that's not up to you to change (generally what other people read, eat, who they hang out with, how much money they spend or save and other habits of theirs).
Yes, you should focus on doing one thing (job, project, business) and doing it well. Yes, you should try several times with several approaches before you give up. Patience is a virtue – up to a point – then it becomes pathology.
When I was in high school, I read about the great unsolved problems of math, physics, cryptography. I had these geekish dreams of, at some point, proving Riemann's hypothesis or Goldbach's conjecture. I thought it would be cool to find the exact general solutions to the Navier-Stokes equations. But then I realized two things: a) doing math problems on paper bored me to death and b) I only thought math was fun if it tackled a practical problem using a computer (so I didn't have to run the calculations myself). I could have been stubborn and forced myself into something that I hated. But I chose the easy way out: admitting my weakness (I hate repetitive work, I'd rather program a computer to do it) and playing to my strength (I like to put real-world problems into mathematical/numerical models).
All in all, every once in a while you gotta check the pulse and be honest about it.
The last thing you want is to keep beating a dead horse.
Not to mention weird.
If you’re leading a team, big or small, at your startup or in a multinational, if you’re a manager, team lead, project manager or department director, here are a few tips to alienate your team and to make almost sure (within a 95% confidence margin) that they do pee in your coffee:
- Never give a straight answer. Avoiding a clear-cut answer insulates you from the responsibility of something going wrong. Doing this too often will decimate your authority in front of your team and it will cancel out any personal or professional connection you have with them. This doesn’t mean you should pretend to know the answer or force yourself to see the world in black and white all the time. Just try, at least every once in a while, to be honest and utter the magic words “I don’t know”.
- Delegate, but don’t take any responsibility. Even if your team members are responsible for their actions, you should take responsibility for the guidance and the direction you give them. Take responsibility for having them work longer hours when they ask for time off. Take responsibility for making them research or work on something that proves unnecessary. Making them take responsibility and be autonomous does not mean you can blame them for everything.
- Contradict people when taking feedback. If you ask for someone's opinion (within your team or outside, employee, contractor or client), make sure to listen to it. Talking back and explaining how their feedback is wrong or how they didn't understand the question almost guarantees that next time they'll just tell you what you want to hear, counting the seconds until you shut up or leave the room.
- Tell people to look at the bright side when there is none. Being positive about your work helps a lot. Everybody knows that people with better morale work harder, get better ideas and communicate a lot easier. But sometimes things are just plain crappy. Don't try to sugar-coat it, don't try to present a bad situation as an opportunity. It is often claimed that in Chinese, "crisis" and "opportunity" share a character, but don't force this on your team unless you have solid arguments, because they will see through your bullshit. And in a crisis, you want your team to be by your side, not to see you as an enemy.
- Refuse to make decisions. It is true that you should let your team self-organize and become autonomous. It is true that you should give guidance instead of orders. This is all true – 90% of the time. But in situations where disagreements on approach, strategy and methodology drag on and slow down your team it is your duty to take charge and set an example. Democracy is a good practice, but some regulation is in order from time to time.
- Constantly crush drive, initiative and performance in the name of team work. Constantly praising some of your team members while ignoring others will definitely ruin the mood. But not praising anyone at all can be even more demoralizing. Moreover, if for every task and for every project you uniformly give credit to the entire team, without emphasizing anyone's contribution, the high performers will stop caring and the low performers will get comfortable. It's like workplace socialism – it simply doesn't converge.
- Make mistakes blur together with achievements, in the name of team work. This isn't to say you should single out anyone who has made a mistake. But clearly stating, in private, what someone could have done better can send the right signal without making anyone feel like an outsider. Letting bad performance and bad mistakes slide for six months or a year (until the scheduled review) can give the impression that the results don't matter.
- Boast about letting people reach their own conclusions, while manipulating them into agreeing with you. In a word, don't be a manipulative prick. And if you are one, don't overdo it. Subtly steering people towards the right conclusion without ordering them is acceptable, within the bounds of common sense. However, trying to make team members you disagree with reach your conclusions without taking responsibility for those conclusions will distance you from your team and dilute your authority.
- Change objectives often. Dissolve vision into short-term tactics. Manipulate principles into fitting your agenda for this week. This week's top priority is minimizing cost. Next week's is sticking to the timeline. The week after that, it's the quality of the deliverable. Next month you'll say micro-management is essential. The month after that is about empowering and guiding, without getting lost in the details. If you change the focus, the principles and the objectives of the team every month or every week, they will all fade to random noise. Focus will become something between not caring and best effort. Principles will be just subtle orders you refuse to take responsibility for. As far as objectives and strategy are concerned, you make everyone a short-term thinker with no vision. Think twice before changing focus. Don't allow the transition periods between changes to overlap. And if you force change without taking the input from your team under advisement, at least take some responsibility when things go awry.
- Deal with everything high-level. Yeah, don't get too involved, act like a CEO. Act like you are driven to work in a limousine each morning. If someone comes to you with a problem, don't give them any hands-on advice. Just stare blankly into their eyes and say "But you'll fix it, right?". That will empower them, for sure. If you refuse to get involved in day-to-day, low-level activities – even for a few hours a month – you'll lose touch with your team. They'll start seeing you as a pencil-pushing smug bureaucrat instead of a leader. Every once in a while, at least for your own curiosity and amusement: put yourself in their shoes. Do the things they have to do. Take your own advice, eat your own dog food. See how it tastes. That will keep your grip on reality. That will keep you from being an outsider inside your own team.
That being said, I’m leaving you with Dilbert and I’m looking forward to your comments.
I'm excited about the cloud. Recently, I've done some research on Amazon Web Services, especially Elastic MapReduce, which allows you to set up your Hadoop cluster in minutes as opposed to weeks or even months. This essentially means that in just a few minutes you can provision an infrastructure of 10-20 or more machines that crunch terabytes of data per hour. And when you're done with it, you just click a button, the cluster goes away and you stop getting billed for it. And this is just the beginning. With a few more clicks you can have your own relational database stack, with redundancy, automatic snapshots and a high-speed cache of its own. Or you can have endless storage on S3 (or archived storage using Glacier). You can get a content delivery network (CDN) serving your website's data from all over the world, lowering response times and taking the load off application servers. You can build fault-tolerant architectures by balancing requests across availability zones (you basically keep one set of machines on the East Coast and another set of stand-by replicas on the West Coast or in Europe).
All in all, the possibilities are endless. Your mind hurts just thinking about what you can accomplish using these tools. You basically have access to what multinational companies only dreamed of in terms of IT infrastructure. The best part is you get charged by the hour, so if you want to scale down or close shop the exit cost is almost zero.
But it’s not all fun and games.
The over-hyped dream cloud has the potential to become one giant storm when you’re not looking. Here are a few things you need to consider:
Downtime and Service Level Agreement
The Amazon instances (EC2, Elastic Compute Cloud) or the volume storage (EBS, Elastic Block Store) you're renting aren't bulletproof. They can fail. They can be wiped out of existence with little or no warning. It's not the "cute" sort of failure either: you cannot just reboot the machine and hope everything is OK. No, whatever data you haven't saved to S3 storage simply disappears. There's no backup by default. There's no fail-over by default. And as far as Amazon's Service Level Agreement goes, they're pretty much covered: if you have unplanned downtime of over 0.05%, you can ask for a 10% refund at the end of the year.
This means that you have to employ several strategies to make sure (a) your data is safe and (b) your service doesn't suffer from downtime. Any and all of which will cost extra.
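It is worth translating that 0.05% figure into wall-clock time. Assuming the threshold is measured over a full year:

```python
# Downtime budget implied by a 99.95% yearly uptime threshold: this much
# downtime can accumulate before the 10% service-credit clause even applies.
HOURS_PER_YEAR = 365 * 24
allowed_downtime_hours = HOURS_PER_YEAR * 0.0005  # 0.05% of the year
print(f"{allowed_downtime_hours:.2f} hours of downtime per year")
```

That is almost four and a half hours a year with no compensation at all, which is why the mitigation strategies are on you, not on the provider.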
I'm writing down a few strategies below, also marking the level of paranoia needed to employ them. Know that the more of these strategies you employ, the less operational risk you run – but at the price of more effort and higher AWS cost.
- (cautiously optimistic) Periodically backup EBS Volumes into S3. Have it setup automatically.
- (cautiously optimistic) Set a backup policy for any Relational Database Service you may be running.
- (slightly paranoid) Make sure your production EC2 instances can run in a load balanced environment. Set up load balancing between enough EC2 instances so that fail-over can occur automatically and that the remaining EC2 instances can handle peak traffic.
- (paranoid) Set up load balancing across availability zones. This would protect you from downtime of an entire Amazon data center (which has been known to happen, again and again).
- (paranoid) Set up EBS and RDS backups across availability zones. This is the equivalent of disaster recovery.
- (paranoid with flying colors) Use a disaster simulation tool, such as Chaos Monkey (open-sourced by Netflix), to make sure your architecture can tolerate random failures in the infrastructure.
Estimating the cost of any of these levels of architecture and infrastructure robustness is no easy feat. It depends on the use case. Picking a level of fault tolerance depends on the type and size of your business or application, and basically comes down to estimating loss/hour or loss/day in case of a failure.
There are no recipes for picking the right balances between cost efficiency and fault tolerance. This post most definitely isn’t aiming at providing the correct answer. The purpose is to get you to ask yourself the right questions.
Performance Overhead

The cloud infrastructure you're renting from Amazon isn't made up of physical machines. This means that whatever you're running on your EC2 instances (operating system, services, application) will be subject to the performance penalty/overhead of the Xen hypervisor. You must understand that EC2 instances are actually virtual private servers, and they are inferior in performance to a dedicated server.
Measuring the exact impact of the Xen hypervisor would of course depend on the application you're running (and on the mix of CPU, RAM and I/O operations required), but there are benchmarks which report EC2 instances having CPU performance ten times slower than dedicated instances at equivalent prices (the same article reports that I/O operations are 5 times slower on EC2). I would take any such extreme results with a pinch of salt, but the point is that Amazon Web Services adds several types of overhead to the applications you're running:
- Virtualization overhead from the Xen hypervisor, which commonly impacts CPU performance
- Network overhead of communication between EC2 instances and EBS volumes, which are not necessarily in the same physical rack. There are ways around it and improvements being made, based on RAID-ing several EBS volumes together and on provisioned IOPS (which means paying extra for better I/O performance)
- Varying performance of EC2 instances (probably depending on the load of other instances provisioned on the same machine)
- Varying performance of network communication between various AWS services (EC2 with S3, RDS, EC2)
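Given all this variance, it pays to measure rather than assume. A minimal timing harness, usable on any instance type you are evaluating, might look like the sketch below; the workload is a stand-in, so substitute something representative of your actual application.

```python
import statistics
import time

def cpu_task(n: int = 200_000) -> int:
    # Stand-in CPU-bound workload; replace with something representative.
    return sum(i * i for i in range(n))

def benchmark(runs: int = 5):
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        cpu_task()
        timings.append(time.perf_counter() - start)
    # Report the spread, not just the mean: on virtualized hardware the
    # worst case can differ noticeably from the best case.
    return min(timings), statistics.mean(timings), max(timings)

best, mean, worst = benchmark()
print(f"best {best*1000:.1f} ms, mean {mean*1000:.1f} ms, worst {worst*1000:.1f} ms")
```

Run the same harness on a few instance types (and on a dedicated box, if you have one) and compare both the means and the worst cases against the hourly prices.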
To sum it up, there is a price to pay for scalability: that price can be paid either in terms of assuming risk (less predictable performance), in terms of assuming lower performance per instance (and consequently, less value for money), or by simply paying more. While AWS is a great service and runs a great business model, you should be aware that having virtually endless infrastructure at your fingertips is no free lunch and no silver bullet.
Cost of Scale
This is a piece of universal truth in economics: you have to pay extra for the opportunity to change your mind or to worry about something later. The banking and insurance industry are built around that. And the Amazon Web Services business model is built around that. AWS is for IT infrastructure like banking and loans are for money – you get whatever you want now, but you pay interest for the fact you didn’t plan ahead.
AWS solves several problems for you, thus taking away the overhead associated with initial investment (i.e. buying your own hardware before you need it and financing it), planning (i.e. figuring out how much you need before you need it), IT infrastructure operations (i.e. someone has to plug the server into the rack, install the OS, patch it, update it), risk, asset management and exit costs (i.e. decommissioning and selling/auctioning off the hardware when you scale down). Moreover, the type of service AWS offers is indispensable for businesses which are subject to seasonal patterns (i.e. traffic peaks on Black Friday or the winter/spring holidays, reaching 5-50 times the average day-to-day traffic).
However, this whole care-free attitude which AWS enables comes at significant cost, direct or indirect:
- You pay more for one hour of CPU than if you rented equivalent or better hardware on a long-term commitment.
- You get poorer performance out of an equivalent configuration because of virtualization overhead (less performance per dollar). You can find a cross-provider cloud performance benchmark here.
- There may be hidden costs (or at least, not-so-obvious costs), such as getting charged for writing from an instance to a storage volume, or getting charged for traffic across availability zones (data centers). Although these charges make sense from a business standpoint, they are easy to miss, since most dedicated server providers don't charge you for writing to your own hard drive.
- There may be higher hidden risks (or at least, not-so-obvious ones). If you buy a physical server and you don't have a periodic backup procedure, and preferably another stand-by or active fall-back server, then on a long enough timeline you have a strong probability of downtime and potentially of losing data. This is universal, no matter if you use EC2 instances, dedicated servers or on-premise hardware. What is different is that the rate of failure for EC2 instances and EBS volumes is higher than for dedicated servers (Amazon reports an annual failure rate of 0.1-0.5%, but I have found no conclusive studies to date). There are fall-back options (like this one), but you need to plan for them and you need to pay for them (paying for additional storage for backup images, for load balancing and for a secondary active/passive EC2 instance). So you either pay extra to eliminate this risk or you end up like this guy.
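The compounding effect of per-volume failure rates is worth a quick calculation. Taking the upper end of that 0.1-0.5% range as an assumed annual failure rate per volume, and assuming failures are independent:

```python
def p_any_failure(annual_rate: float, volumes: int) -> float:
    """Probability that at least one of `volumes` independent volumes
    fails within a year, given a per-volume annual failure rate."""
    return 1 - (1 - annual_rate) ** volumes

for n in (1, 10, 100):
    print(f"{n:>3} volumes: {p_any_failure(0.005, n):.1%} chance of a failure/year")
```

At 100 volumes, a seemingly negligible 0.5% per-volume rate turns into roughly a 40% chance of losing at least one volume per year, which is why backup and fail-over planning is not optional at scale.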
The point is not that “the cloud is evil“. The point is that it has different limitations, costs and risks than one would initially imagine. It is most definitely more suitable for some businesses than for others. Before jumping into the cloud, one should account for the actual business needs in terms of availability and performance, consider reducing cost either by taking a long-term commitment or by accepting some risk, and most importantly, architect infrastructure and operations to achieve fault-tolerance.
Why Should I Care?
You might think you don’t care. You’re not building cloud apps. You’re not designing cloud architectures. What you do for fun and profit might have nothing to do with IT. Right?
It has more and more to do with it.
The more you store your pictures online on Instagram or DeviantArt instead of burning them to DVDs. The more you keep your contacts and your connections on Facebook and Google. The more you use Gmail or Yahoo. The more you keep your finances and your documents online in Google Docs or Office 365. The more you move stuff out of your house, out of your computer and out of your business into the cloud, the more you do care.
The fact that I did research on Amazon Web Services lately is purely a coincidence. I came up with the idea for this post after one very simple thing happened: some of the documents from my Google Drive disappeared. For most of them, I had no backup. Fortunately, they got restored a few months after I reported the issue. But it was enough to get me thinking.
I’m not arguing either for or against working in a big corporation. It’s a choice that everyone should make for themselves, with their own context and priorities in mind. The purpose of writing this article is to make sure that you make an informed decision. So here are 10 things you should consider – for yourself.
Here’s stuff to consider:
- Nobody wants to take responsibility for any change or for any remote risk. Decisions are left to teams, which, in my opinion, is the slowest, most ineffective process and yields the lowest degree of accuracy. The alternatives would be allowing decisions to be made by someone with knowledge, someone with power or (ideally) someone with both. The first reason why this doesn’t happen is that people with knowledge do not know the political intricacies within a corporation (one hand scratches the other style, e.g. “I’ll pass you some of my budget if you promote my project, which in turn might get me promoted”), and so they might make the “unpopular” decision (e.g. picking one vendor over another, promoting one person’s idea over another’s and so on). The second reason is that people with enough power to make or at least to steer a decision don’t have the knowledge, the time or the will to understand its implications and its long-term impact. Between these two reasons, upper and middle management force the lower drones to reach consensus over each decision, so as to protect upper and middle management from decision-making, responsibility and risk. And the cost of this policy is that no decision is clear-cut, everything is a compromise – which works wonders for customer/user experience.
- Procedure over habit, habit over reason and common sense. What you have to understand is that it doesn’t matter if you’re right. All that matters is whether you can be proven wrong. And procedures (or protocols, or bureaucracy) provide the shelter against being wrong. Ultimately, it doesn’t matter if you made a good call or a bad call, as long as you made the call you were supposed to make, by respecting procedure. This creates countless situations where the actual clear, simple, straightforward approach is unacceptable because it would violate procedure, and the procedural way takes 10 times longer. It gets even more interesting as the procedural chain contains a lot of people who are either too busy, too uninterested, too incompetent or too afraid of doing something wrong. Procedure isn’t the greatest enemy, however. Habit is. A lot of people don’t want to try new things because they have a little comfort zone in their way of doing things (no matter how slow, inefficient or obsolete).
- (For Romania only) Taking English words for granted and copying them into the native language.
- “Leverage” becomes “levier” (the actual word: “parghie”).
- “To commit” becomes “a comite” (the actual word: “a-si asuma”)
- “To assume” becomes “a asuma” (the actual word: “a presupune”)
- “To assess” becomes “a asesa” (the actual word: “a evalua”)
- Leave your comment with your own corporate bullshit line or roenglish word
- Your ideas aren’t rejected, they are prioritized. Telling someone their idea sucks, or that it’s irrelevant, or that there’s no time and no money for it takes guts. And guts aren’t something you find too often. When you say that someone’s idea isn’t a good idea, you actually take responsibility. But you don’t want that responsibility, so you delay it – indefinitely. The context isn’t right, the budget isn’t right, the load on the development or testing team is too high, there are more important initiatives/projects in the pipeline. NO – when someone tells you “this can wait”, you read “this can wait until next month”, but what it actually means is “this can wait until pigs fly/I quit this goddamn job/I move into another position”. It means “I don’t want to make the extra effort” or “I don’t want you to take the credit” or “I don’t want to take a gamble on your idea”. If there’s one rule that applies, it’s the rule of precedents: if it can wait this month, it will be able to wait next month, and the month after that.
- Everybody wants to be a manager of some kind. You can’t be a shelf boy – the politically correct term is “shelf stocking manager”. It gives you (the employee) the illusion that you’re doing something important. Of course, your so-called title wouldn’t be worth the piece of paper it was written on in the outside world – but every morning, when you get your coffee from the machine, it gives you the illusion of meaning, of purpose, of responsibility. And let’s be honest: giving people fancy, worthless job titles is a lot cheaper than giving them an actual raise.
- Gratefulness has a short memory span and promises aren’t worth the piece of paper they could have been written on. Your company owes you so much. If it weren’t for your commitment, your involvement, your overtime, your blood, sweat and tears – we wouldn’t have pulled it off. “Good,” said the Grumpy Cat, “in light of that, can I have my raise and/or promotion now?” Not in six months, or three quarters down the line, or at the end of the next financial year. “Now” would be a good time. ‘Cause by the next pay check, you won’t even remember my name or what it was I did so great. You won’t remember my extra effort or my involvement. Which kinda makes sense, because by then you’ll have other problems to worry about. The truth is it doesn’t take a raise to let you know someone notices. It takes a VP shaking your hand or taking 5 minutes to come over at lunch. It takes a couple of tickets to a concert or to a spa. It takes a small bonus per project. It doesn’t matter what your standards are for being rewarded. The point is that if something tangible, real or written does not happen within two weeks of your great corporate achievement, everybody will have forgotten about it (including your colleagues, your boss and most definitely your boss’ boss).
- Politics are more important than hard work and knowledge. Look, corporations aren’t this sort of Borg-like highly-integrated collective intelligence. No. They’re made of individual people. And most people don’t wake up thinking “what would the customer want?”. People want a raise. A vacation. Their percentage or their commission. A promotion. Recognition. A big-screen smart-TV. And people realize that they should help each other achieve those purposes in ways that are not necessarily the best for the company, the customer or the environment. Oftentimes, what makes sense rationally or what yields lowest cost or highest result in the long run isn’t necessarily what’s best for the career and lifestyle of your superiors and peers. The right decision might get them more work to do, less visibility, more responsibility, more risk of someone asking them why they fucked up. More importantly, what you decide or do might make their promises or allegiances to others not stick, which would hurt their career. So no matter how much work you do, how up-to-date you keep yourself on industry topics or how well you seek the interest of the company, having coffee/lunch with the right people always matters more. In a way, corporations are like high schools: there are “the losers” who attend to actually learn something and there are the “cool kids” who have a club of their own. Forget your Inbox. Football jokes over coffee and going out after work are far more important.
- If there isn’t a KPI for it, then it doesn’t matter. Corporations are places where people who have no idea what you’re doing have to decide whether to punish you, reward you or (most frequently) just ignore you. The tools for doing that are KPIs, objectives and reviews. Long story short: if your achievement or extra effort during the year is not on the little objectives sheet that was written at the beginning of the year, there’s little chance anyone will notice or care. Which kinda makes sense, until you are confronted with situations where entire teams argue whether to actually improve customer experience or to improve the customer experience metrics (so the experience stays the same, but it looks better in reports). A lawyer once told me there are two kinds of truth: the truth you know and the truth you can prove. Let me just tell you, in corporations, the truth does not exist.
- What can make your job easier will also make you irrelevant. I love it when people get excited about the team being supplemented with some new contractors. Or about that piece of software which automatically scans invoices so you barely have to type anything in. Yeah. Great. All great organizational announcements about making your job easier, faster and more efficient should be a subtle cue that you should already be looking for another job. While you might naively rejoice in the thought that someone will pay you the same money to do less work, you should know that the main purpose of a business isn’t to keep employees happy, but to keep them at 99% capacity (120% if possible) and to shrink costs. If there were 20 people inputting invoices, this brand new system makes it possible for things to get done with just 5 people.
- Promises travel and get inflated upstream. Blame travels and gets inflated downstream. Your boss asks you what you think of the new piece of software. You tell him it’s crap, you give him a long list of reasons and suggest several alternatives, which he ignores. After that, his boss asks him what he thinks. He says it’s a great piece of software, state-of-the-art, best-of-breed, industry-standard, which has some issues, but they will be fixed. The downside is constantly under-estimated as information travels upstream through countless layers, until ultimately the CTO/CEO/CxO think it’s a great product and everything is going according to plan. Then, when things actually do break, customers start ringing the phone off the hook, the CTO/CEO/CxO are dissatisfied and ask what’s wrong. The next level says “I thought the issues were going to be solved”. Finally, your boss tells you “It was your job to fix this”. See what happens: the information flow between decision and execution is broken. The orders are emitted with too little knowledge and too little context, and the feedback being received has more noise than actual information.
To sum it up, I’m not saying “Don’t work in a corporation”. It can be a good thing, especially if you want a decent salary, little responsibility and low short-medium term risk. All I’m saying is you should be aware of the downsides and shouldn’t drink the Kool-Aid. Know fact from bullshit and don’t lie to yourself on behalf of the company (“they’re gonna make me Executive Shelf Stocking Manager next year”). Don’t take empty promises and beware of fake principles which people use to push their agenda. Try to work in a profit center instead of a cost center, because bringing in money gets you a better image. Failing that, try to work in a department which is at the business’ core, because you’ll matter more. And while having a good personal relationship with co-workers is great, remember that you’re doing your job for the money – making friends is a secondary objective. Oh and… if you really want a raise, your best chance is to find yourself another better-paying job.
Looking forward to your – preferably hateful – comments.
And considering this is a post about corporations, it should also have a legal disclaimer, shouldn’t it? So here it is:
The events and opinions depicted in this article are fictitious and purely represent the opinion of the author. Any similarity to any person living or dead, or to any organization is merely coincidental.
We hear a lot about cloud infrastructure and scaling out, but the truth is that no matter how much hardware and infrastructure (be it physical or virtual) you have, the bottleneck usually lies in application or database setup and configuration. In this article, I’m going to outline the general directions for optimizing web server performance. In the following articles I will follow up with more detailed suggestions. I will assume you’re using Apache HTTPD as a web server. Most of the optimization paths outlined here are also valid for other web servers (nginx, lighttpd), although they might come under different names and flavors.
Here’s the stuff you should check:
1. Make sure you have KeepAlive enabled.
This is done by setting the KeepAlive directive in your Apache configuration.
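As a minimal sketch (these are standard httpd directives, but the configuration file location varies by distribution, e.g. httpd.conf or apache2.conf):

```apache
# Keep TCP connections open across HTTP requests
KeepAlive On
# How many requests a single persistent connection may serve
MaxKeepAliveRequests 100
# Seconds to wait for the next request before closing the connection
KeepAliveTimeout 15
```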
This setting avoids the network overhead of establishing a new connection for every single HTTP request. With KeepAlive set to On, the browser establishes one connection with the web server and then transfers all the request/response pairs over that connection. If KeepAlive were Off, the browser would have to re-establish the connection for every single request. As a ballpark estimate, the time it takes to set up a connection is equivalent to the ping time between the client machine and the server machine (5-50 ms). Furthermore, you can estimate that setting KeepAlive to On will shave off approximately
Time Saving = (Number of Request per Page) * (Ping Time) / (Parallel Connections from Browser)
Most browsers set up 6 parallel connections. Considering a ping time of around 20 ms for a page with 50 requests, that adds up to 50 × 20 / 6 ≈ 167 ms shaved off every page by changing just one line of configuration.
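The estimate above can be sketched in a few lines of Python (the numbers are the article's illustrative values, not measurements):

```python
def keepalive_saving_ms(requests_per_page, ping_ms, parallel_connections=6):
    """Per-page time saved by KeepAlive: one connection setup
    (roughly one ping time) avoided per request, spread across
    the browser's parallel connections."""
    return requests_per_page * ping_ms / parallel_connections

# 50 requests per page, 20 ms ping, 6 parallel browser connections
print(round(keepalive_saving_ms(50, 20)))  # ≈ 167 ms saved per page
```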
Also, be careful with KeepAliveTimeout. This defines how long a TCP/IP connection is kept open. The downside of setting this parameter too high is that slow clients will unnecessarily keep resources blocked on your server (memory, file descriptors). The default value is 15 seconds. As a rule of thumb, don’t go below 5 seconds or above 20 seconds.
2. Make sure your Apache HTTPD process is running mod_worker (instead of mod_prefork).
Apache can naturally serve more requests at the same time. But there is additional performance to be leveraged depending on how those parallel requests are served.
mod_prefork essentially means that Apache spawns several httpd processes, with each process serving one single request at a time.
mod_worker means that Apache spawns several httpd processes, with each process having several threads, each capable of serving one request at a time.
The main difference is that processes are more expensive than threads. They are more expensive in memory (as the entire process needs to be duplicated, instead of just the thread stack) and they are more expensive for the CPU (which spends more time switching between process contexts than between thread contexts).
Before you read on, you should know that mod_worker is not naturally compatible with all Apache extensions, especially the all-popular mod_php. This is because mod_php is not thread-safe – that is, it’s built to run in its own process, not in a thread. To get around this limitation, you should use FastCGI and compile PHP with the FastCGI libraries.
In order to run Apache httpd with mod_worker, you need to modify your Apache configuration, adding (or removing the # in front of) the line that loads the worker MPM.
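As a sketch, on Apache 2.4 this is the LoadModule pair to flip (module paths vary by distribution; on Apache 2.2, worker was typically selected at build time or via the distribution’s sysconfig instead):

```apache
# Switch from the default prefork MPM to worker (Apache 2.4)
#LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
LoadModule mpm_worker_module modules/mod_mpm_worker.so
```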
If you’re using PHP, you might get the following error:
Apache is running a threaded MPM, but your PHP Module is not compiled to be threadsafe. You need to recompile PHP.
In this case, see the above link about using FastCGI to allow PHP to run in its own memory space.
3. Fine tune your Multi-Processing Module
Activating mod_worker as your MPM is a good start, but it’s not enough. The number of threads and processes greatly depends on your configuration, especially on the number of CPU/vCPU cores the machine has.
Below you’ll find an example of an MPM worker configuration. Please note that this isn’t a universal solution. You should also read the explanations from Apache here and here and always do your own testing.
<IfModule worker.c>
    ServerLimit 4
    StartServers 2
    MaxClients 128
    MinSpareThreads 15
    MaxSpareThreads 35
    ThreadsPerChild 32
    MaxRequestsPerChild 1000
</IfModule>
Let’s see what each directive means:
- ServerLimit is the maximum number of httpd processes that will run.
- StartServers is the number of httpd processes that are first started when you run Apache.
- MaxClients is the maximum number of clients served simultaneously at any one point.
- MinSpareThreads, MaxSpareThreads are the minimum and maximum number of idling threads which are to be available at any one point.
- ThreadsPerChild defines how many threads each process hosts. 16 or 32 is a good idea and you probably shouldn’t go above that.
- MaxRequestsPerChild is the maximum number of HTTP connections which are being served by a process before respawning (killing and restarting) a httpd process. This should probably lie between 1000 and 10000 requests. It is a useful feature for limiting memory thrashing, leaks or other related performance degradation issues. Note that having KeepAlive On makes httpd count the number of connections (which can contain several requests), not the number of requests.
You should consider these following rules of thumb:
- MaxClients ≤ ServerLimit*ThreadsPerChild (in the example above, 128 = 4 × 32)
- Make sure that ServerLimit < ThreadsPerChild. Always consider scaling out the number of threads before the number of processes. The reason is that process switching is more CPU-expensive than thread switching (mostly because threads share the same memory space)
- ServerLimit should be at least equal to the number of cores, but no larger than four times the number of cores. I recommend going for two times the number of cores (unless your tests show different results).
- StartServers should be half of ServerLimit.
- ThreadsPerChild should be between 16 and 32.
- Unless you have good reason (your own tests), don’t exceed 32 threads per process.
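The rules of thumb above can be condensed into a small sanity check (a hypothetical helper for illustration, not an official Apache tool):

```python
def check_mpm_config(server_limit, start_servers, max_clients,
                     threads_per_child, cores):
    """Return warnings for an MPM worker config, per the rules of thumb."""
    warnings = []
    if max_clients > server_limit * threads_per_child:
        warnings.append("MaxClients exceeds ServerLimit * ThreadsPerChild")
    if not cores <= server_limit <= 4 * cores:
        warnings.append("ServerLimit should be 1x to 4x the number of cores")
    if start_servers != server_limit // 2:
        warnings.append("StartServers should be about half of ServerLimit")
    if not 16 <= threads_per_child <= 32:
        warnings.append("ThreadsPerChild should be between 16 and 32")
    return warnings

# The example configuration above, on a 2-core machine: no warnings
print(check_mpm_config(4, 2, 128, 32, cores=2))  # → []
```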
So there you have it.
Always remember: your own research and your own tests on your own scenarios are always more relevant than what I or anyone else advise.
In Part II of the article, we’ll get into details about caching:
4. Mark all static resources as cacheable
5. Leverage mod_disk_cache and ramfs to achieve in memory caching
6. Leverage database query caching
In the last part of the article, we’ll be covering:
7. Minify and join JS and CSS files.
8. Use CSS image sprites as much as possible
9. Precompile PHP files
10. Mount pid files on ramfs
FQL Console is a simple Facebook application allowing you to learn more about your friends, by answering questions like:
- Who are the single girls from my friends?
- Who are the single guys ?
- What girls are in complicated relationships?
- What girls are in open relationships?
- Who among your friends is in an open relationship?
Of course, you can ask questions completely unrelated to relationship gossip.
Of course, if you’re a little bit geeky – you can write your own queries: just click the green button on the right reading “Advanced: Write and execute your own query”. There you can type in your own query, modify existing queries and execute them. Moreover, if a result is returned without error, you can save the query so others can use it too!
If you think relationship gossip is boring, you can tackle other subjects, including words appearing in posts, photos and tags. I haven’t explored all the possibilities myself, so feel free to try queries and share them with your friends. I would say the sky is the limit, but actually FQL (Facebook Query Language) is the limit – Facebook decides which queries are valid, depending on their syntax, complexity and your friends’ privacy settings. I won’t bore you with all the techie details, I’m sure you’ll dive into them if you feel like it. Let’s just say FQL is a simplified, underfeatured version of SQL.
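To give a flavor of what such a query looks like, here is a sketch based on the public FQL reference (table and field names are FQL’s; what it returns depends on your friends’ privacy settings):

```sql
-- "Who are the single girls from my friends?" as an FQL query
SELECT name, relationship_status
FROM user
WHERE uid IN (SELECT uid2 FROM friend WHERE uid1 = me())
  AND sex = 'female'
  AND relationship_status = 'Single'
```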
I put about 12-15 hours into the concept & development of FQL Console.
So it was pretty much a weekend project for both of us.
Four reasons why we built it:
- It was fun, a great opportunity to better understand the Facebook API.
- I wanted a simple way to view stats about my 500+ friends (i.e. relationship gossip unleashed)
- I wanted to unleash the power of FQL to users who aren’t tech savvy and don’t want to get their hands dirty with code.
- Social experiment
FQL Console needs access to some of your data (especially your friends’ data) so it can run queries.
Any information obtained by FQL Console will not be shared with third parties and will not be used for commercial/marketing purposes. Some information may be temporarily stored on our server to improve performance of the application.
We ask for your email and permanent access to your data because we plan to unleash an email notification feature for query changes (“Notify me if there are new single girls on my profile”). We won’t spam. Any email we send you (if any) will have a click-once unsubscribe link.
Sharing a query does NOT mean sharing the answer to that query. So for instance, the query “Who are the single girls from my friends?” returns Mary, Ann and Carol on my profile. But if I share the query (the query link) with my good friend, Radu, he will just see the answer to the same question for his profile (let’s say Hannah, Beatrice and Joelynn). FQL Console does not facilitate or encourage sharing or otherwise publishing the info from your profile with others.
Last but not least, FQL Console is not a hack, it respects Facebook ToS and it does NOT give you access to additional information. It is just a way to get a different perspective on the Facebook info you already have access to.
If you enjoyed it…
Tell your friends to tell their friends about FQL Console
Try your own queries. Share them.
Drop us a line. Or a comment.
For the past few days, I’ve been spending my spare time working on a simple and fun app that lets everyone browse their Facebook friends in a more interesting fashion and find out new things about them.
While the app is functional, I’m still squashing some usability bugs.
I’m not a designer, so don’t expect anything too breath-takingly eye-catching.
I am, however, a very curious person, so do expect something that satisfies your curiosity.
And don’t worry, it’s nothing against the Facebook ToS.
I’ll make it public within the next week – I’m very excited to know what you think.
This article describes my personal vision of the top three hot topics in technology 50 years from now. Part one of the article focused on nanotech and its two development branches (electronic and biological); this part will focus on a particular high-impact application of nanotech: neural interfaces.
Neural interfaces (i.e. brain computer interfaces) are the destination of our technological journey which brings people and machines closer together. Computers were initially programmed by manually rewiring components, then by keyboard and mouse and nowadays by touch and gesture. However, the interface begins to be more and more the bottleneck in man-computer processing. In other words, people think faster and computers tend to work faster than you can type or touch.
In the diagram below, I’ve listed the basic actions the human mind and the computer need to perform when a user thinks of looking up the entity that performs a “meow-meow” action.
The colors are meant to both link actions on the vertical and to highlight how much time they take (warm color = takes a lot of time).
As you can see, actually moving your fingers over the keyboard/touchscreen takes up the most time. Also (subconsciously) processing how to move the fingers is quite a lengthy process.
This is exactly what neural interfaces aim to fix: they remove the need to think about moving the fingers and to physically move them at all, sending the information directly to the machine via a much faster interface. This would make all querying and commanding processes a lot faster, bringing them from several seconds to under a second. It may not seem like a significant gain, but it builds up over a lot of commands.
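A bit of back-of-the-envelope arithmetic shows how this builds up (the figures are pure assumptions, for illustration only):

```python
# Assume a command drops from ~3 s (think + type/touch) to ~0.5 s (neural)
saving_per_command_s = 3.0 - 0.5
commands_per_day = 500  # assumed for a heavy computer user

daily_saving_min = saving_per_command_s * commands_per_day / 60
print(round(daily_saving_min))  # ≈ 21 minutes saved per day
```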
Of course, this is not the only advantage. The other one would be being able to convey more information directly to the system. In the example above, consider you weren’t looking for cats, but for cute, white cats with black spots. Typing this would take a lot longer and it probably wouldn’t yield all the relevant results – because the more complex an idea is the more ways there are to express it (in natural language).
To sum it up, the point of neural interfaces is to make communication between humans and machines faster, more complete and more reliable. Of course, communication between humans would benefit in the same way (if language is by-passed).
I know, I know – this whole thing sounds kind of like science fiction. But there is one piece of the puzzle, one technology needed in order to ignite the development in neural interfaces.
And that is – pam pam – nanotechnology.
Nanotechnology is the tool needed to physically build these interfaces between biological and digital. These interfaces would be extremely complex in structure, very fine and very small (think nano). With present technology, we have no way to build such fine connections, with complex structures to fit neural pathways. But hopefully with the help of nano-bots, this will start being possible in 25-30 years.
Once the hardware is available (even in a crude form), data will be collected and software will emerge. Unlike most fields, the problems posed by neural interfaces are more on hardware than on software.
Moreover, some crude experiments already exist, such as Lateral Geniculate Nucleus Cat Vision – allowing the transmission of a video feed from a cat’s brain. Similar projects exist for capturing the vision feeds of rats and insects. But all of these attempts are more or less like attempting brain surgery with slaughterhouse equipment. And, just to emphasize, the equipment is the hardware.
Needless to say, there are countless ethical considerations around these future developments, including consent, privacy, mind control and challenging our very sense of self and identity. This will probably be the 2060 equivalent of today’s web privacy concerns.
The applications however are endless:
- Artificial telepathy
- Shared empathy
- (hang on for this one) The World Wide Mind
However, until this science fiction even begins to approach reality, we need to develop nano-bots able to build complex, microscopic structures – such as neural pathways. Nanotech alone will take another 25 years to reach this point, so I expect the first generation of neural interface to emerge 30-35 years from now.
Hang on for part 3 of the series.
Until then, I’m looking forward to your thoughts, as always.