Tuesday, April 26, 2011

Top 10 Concepts Every Software Engineer Should Know


The future of software development is about good craftsmen. With infrastructure like Amazon Web Services and an abundance of basic libraries, it no longer takes a village to build a good piece of software.

These days, a couple of engineers who know what they are doing can deliver complete systems. In this post, we discuss ten concepts that software engineers should know in the years to come.

A successful software engineer knows and uses design patterns, actively refactors code, writes unit tests and religiously seeks simplicity. Beyond these basic methods, there are concepts that good software engineers know about. These transcend programming languages and projects - they are not design patterns, but rather broad areas that you need to be familiar with. The top 10 concepts are:

1. Interfaces

2. Conventions and Templates

3. Layering

4. Algorithmic complexity

5. Hashing

6. Caching

7. Concurrency

8. Cloud Computing

9. Security

10. Relational databases

1. INTERFACES:

The concept of the interface is the cornerstone of software. All good software models a real (or imaginary) system. Understanding how to model the problem in terms of correct and simple interfaces is crucial. Many systems suffer from one extreme or the other: clumped, lengthy code with few abstractions, or an over-designed system with unnecessary complexity and unused code.
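To make this concrete, here is a minimal Java sketch (the PaymentProcessor name and its method are purely illustrative, not from any real library): a small, focused interface lets you swap implementations without touching the calling code.

```java
// A small, focused interface: callers depend only on the behavior they
// need. The names here are illustrative, not from a real library.
public interface PaymentProcessor {
    // Charge the given amount (in cents) and return a transaction id.
    String charge(long amountInCents, String accountId);
}

// One concrete model of the system; a test stub or a different provider
// can be swapped in without changing any calling code.
class LoggingPaymentProcessor implements PaymentProcessor {
    @Override
    public String charge(long amountInCents, String accountId) {
        System.out.println("Charging " + amountInCents + " cents to " + accountId);
        return "txn-1";
    }
}
```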

Among the many books on the topic, we recommend Agile Programming by Dr. Robert Martin because of its focus on modeling correct interfaces.

2. CONVENTIONS AND TEMPLATES:

Naming conventions and basic templates are the most overlooked software patterns, but probably the most effective.

Naming conventions are used in many places to make software automatic. For example, the Java Beans framework is based on a simple naming convention for getters and setters. Another example is the canonical URLs on del.icio.us: del.icio.us/tag/software takes the user to the page that has all the items tagged software.
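As a quick illustration, here is a minimal JavaBean (the Person class is just an example): frameworks discover the name property purely from the getName/setName convention, with no extra configuration.

```java
// A minimal JavaBean. Tools and frameworks infer a "name" property
// from the getName/setName naming convention alone.
public class Person {
    private String name;

    public String getName() { return name; }

    public void setName(String name) { this.name = name; }
}
```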

Many social software applications use similar naming. For example, if your user name is JohnSmith, then it is likely that your avatar is johnsmith.jpg and your RSS feed is johnsmith.xml.

Naming conventions are also used in testing; for example, JUnit automatically recognizes all methods in a class that begin with the prefix test. The templates here are not C++ or Java language constructs. We are talking about template files that contain variables and then allow binding of objects, resolution of the variables, and rendering of the result for the client.
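Here is a bare-bones sketch of that idea in Java (the TinyTemplate class is hypothetical, not a real engine): placeholders in a template are resolved against bound values and the result is rendered for the client.

```java
import java.util.Map;

// A toy template renderer: replace ${var} placeholders with values
// from a map, then hand the result to the client.
public class TinyTemplate {
    public static String render(String template, Map<String, String> vars) {
        String result = template;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String page = "<h1>Hello, ${user}!</h1>";
        System.out.println(render(page, Map.of("user", "johnsmith")));
        // prints: <h1>Hello, johnsmith!</h1>
    }
}
```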

3. LAYERING:

Layering is probably the simplest way to discuss software architecture. It first received serious attention when John Lakos published his book about large-scale C++ systems. Lakos argued that software consists of layers. The book introduced the concept of layering. The method is this: for each software component, count the number of other components it depends on. That number is the metric of the component's complexity.

According to Lakos, good software follows the shape of a pyramid, i.e., there is a gradual increase in the cumulative complexity of each component, but not in the immediate complexity. Put differently, a good software system consists of small, reusable building blocks, each carrying its own responsibility. In a good system, no cyclic dependencies between components are present, and the whole system is a stack of layers of functionality, forming a pyramid.
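A rough sketch of Lakos' metric in Java (the component names and graph representation are made up for illustration): count how many other components each one depends on, directly or transitively.

```java
import java.util.*;

// Lakos' metric, sketched: for each component, count the number of
// other components it depends on, directly or transitively.
public class DependencyMetric {
    public static int complexity(String component, Map<String, List<String>> deps) {
        Set<String> reached = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(deps.getOrDefault(component, List.of()));
        while (!pending.isEmpty()) {
            String c = pending.pop();
            if (reached.add(c)) {
                pending.addAll(deps.getOrDefault(c, List.of()));
            }
        }
        return reached.size();
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = Map.of(
                "ui", List.of("service"),
                "service", List.of("model", "util"),
                "model", List.of("util"),
                "util", List.of());
        System.out.println(complexity("ui", deps));   // 3: service, model, util
        System.out.println(complexity("util", deps)); // 0: a bottom layer
    }
}
```

In a pyramid-shaped system these counts grow gradually as you move up the layers, and the traversal terminates precisely because there are no cyclic dependencies.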

Lakos' work was a precursor to many later developments in software engineering, in particular refactoring. The idea behind refactoring is to continuously reshape the software to ensure that it remains robust and flexible. Another major contribution came from Dr. Robert Martin of Object Mentor, who wrote about dependencies and acyclic architectures.

Among the tools that help engineers maintain the system architecture are Structure 101, developed by Headway Software, and SA4J, developed by my former company, Information Laboratory, and now available from IBM.

4. ALGORITHMIC COMPLEXITY:

There are just a handful of things engineers must know about algorithmic complexity. The first is big O notation. If something takes O(n) time, it is linear in the size of the data; O(n^2) means quadratic time. Using this notation, you should know that searching through an unsorted list is O(n), while binary search (through a sorted list) is O(log n). And sorting n items takes O(n log n) time.
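For instance, here is binary search in Java, which finds an item in a sorted array in O(log n) time instead of the O(n) scan an unsorted list would require:

```java
// Binary search over a sorted array: O(log n) instead of the O(n)
// linear scan needed for an unsorted list.
public class Search {
    public static int binarySearch(int[] sorted, int target) {
        int lo = 0, hi = sorted.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // written this way to avoid overflow
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1; // not found
    }

    public static void main(String[] args) {
        int[] data = {2, 3, 5, 8, 13, 21};
        System.out.println(binarySearch(data, 13)); // prints 4
    }
}
```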

Your code should (almost) never contain multiply nested loops (a loop inside a loop inside a loop). Most code written today should use hashtables, simple lists, and singly nested loops. Thanks to the abundance of excellent libraries, we do not focus as much on efficiency these days. That is fine, since tuning can come later, after you get the design right. Elegant algorithm design and attention to performance are still things you should never ignore. Writing compact, readable code helps ensure that your algorithms are clean and simple.

5. HASHING:

The idea behind hashing is fast access to data. If the data is stored sequentially, the time to find an item is proportional to the size of the list. A hash function computes a number for each item, which is used as an index into a table. Given a good hash function that uniformly spreads data across the table, lookup time is constant. Perfect hashing is difficult to achieve, so to cope with collisions, hashtable implementations support collision-resolution algorithms.
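In Java, for example, a HashMap gives you this behavior out of the box: lookup by key is amortized constant time, and collisions within a bucket are resolved by the implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Lookup by key is amortized constant time: the key's hash code is
// turned into a bucket index instead of scanning the whole collection.
public class HashLookup {
    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("alice", 34);
        ages.put("bob", 27);
        System.out.println(ages.get("bob")); // 27, found without a linear scan
    }
}
```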

Beyond the basic storage of data, hashes are also important in distributed systems. The so-called consistent hash is used to spread data evenly among the computers in a cloud database. A flavor of this technique is part of Google's indexing service: each URL is hashed to a particular computer. Memcached similarly uses a hash function. Hash functions can be complex and sophisticated, but modern libraries have good defaults. The important thing is how hashes work and how to tune them for maximum performance benefit.
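Here is a deliberately minimal consistent-hashing sketch in Java (real systems add virtual nodes and much stronger hash functions than hashCode): servers sit on a hash ring, and each key goes to the first server clockwise from its hash, so adding or removing a server only remaps a fraction of the keys.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// A minimal consistent-hashing ring. Servers are placed on the ring by
// their hash; a key is assigned to the first server at or after its
// own hash, wrapping around at the end of the ring.
public class ConsistentHash {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    public void addServer(String server) {
        ring.put(server.hashCode(), server);
    }

    public String serverFor(String key) {
        SortedMap<Integer, String> tail = ring.tailMap(key.hashCode());
        return tail.isEmpty() ? ring.firstEntry().getValue()
                              : tail.get(tail.firstKey());
    }

    public static void main(String[] args) {
        ConsistentHash ch = new ConsistentHash();
        ch.addServer("node-a");
        ch.addServer("node-b");
        ch.addServer("node-c");
        System.out.println(ch.serverFor("http://example.com/page"));
    }
}
```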

6. CACHING:

No modern web system works without a cache, an in-memory store that holds a subset of the information typically found in the database. The need for a cache comes from the fact that generating results from the database is costly. For example, if you have a site that lists the books that were popular last week, you would want to compute this list once and place it in the cache. User requests then fetch the data from the cache instead of hitting the database and regenerating the same information.

Caching comes with a cost: only a subset of the data can fit in memory. The most common data-pruning strategy is to evict the items that were least recently used (LRU). The pruning has to be efficient so that it does not slow the application down. Many modern web applications, including Facebook, rely on a distributed caching system called Memcached, developed by Brad Fitzpatrick while he was working on LiveJournal. The idea was to create a caching system that takes advantage of spare memory capacity on the network. Today, Memcached libraries exist for many popular languages, including Java and PHP.
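An LRU cache is easy to sketch in Java, because LinkedHashMap can maintain entries in access order and evict the eldest one for you:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// A small LRU cache: LinkedHashMap in access order drops the least
// recently used entry once the capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // true = order entries by access, not insertion
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // touch "a", so "b" is now least recently used
        cache.put("c", "3"); // evicts "b"
        System.out.println(cache.keySet()); // [a, c]
    }
}
```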

7. CONCURRENCY:

Concurrency is one topic engineers are notorious for getting wrong, and understandably so, because the brain does not naturally juggle many things at once and schooling emphasizes linear thinking. Yet concurrency is important in any modern system. Concurrency is about parallelism within the application. Most modern languages have a built-in concept of concurrency; in Java, it is implemented using threads.

A classic example of concurrency is the producer/consumer pattern, where a producer generates data or tasks and places them where worker threads can consume and execute them. The complexity of concurrent programming stems from the fact that threads often need to operate on shared data. Each thread has its own sequence of execution, but they all access common data. One of the most sophisticated concurrency libraries was developed by Doug Lea and is now part of core Java.
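Here is the producer/consumer pattern using a BlockingQueue from that java.util.concurrent library; the queue takes care of all the locking:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Producer/consumer with a BlockingQueue: the producer blocks when the
// queue is full, the consumer blocks when it is empty, and no explicit
// locks are needed around the shared data.
public class ProducerConsumer {
    public static void main(String[] args) {
        BlockingQueue<Integer> tasks = new ArrayBlockingQueue<>(10);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    tasks.put(i); // hand a task to the workers
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 5; i++) {
                    System.out.println("processed task " + tasks.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}
```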

8. CLOUD COMPUTING:

In our recent post about compute clouds, we talked about how commodity cloud computing is changing the way large-scale web applications are delivered. Massively parallel, cheap cloud computing reduces both the cost and the time to market. Cloud computing grew out of parallel computing, the concept that many problems can be solved faster by performing the calculations in parallel.
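As a tiny illustration of that idea (of parallelism itself, not of cloud infrastructure), Java can split an independent computation across cores with a parallel stream:

```java
import java.util.stream.LongStream;

// The essence of parallel computing: an independent workload split
// across cores produces the same answer, faster on multi-core machines.
public class ParallelSum {
    public static void main(String[] args) {
        long n = 100_000_000L;
        long sequential = LongStream.rangeClosed(1, n).sum();
        long parallel = LongStream.rangeClosed(1, n).parallel().sum();
        System.out.println(sequential == parallel); // true
    }
}
```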

After parallel algorithms came grid computing, which runs parallel computations on idle desktop machines. One of the first examples was Berkeley's SETI@home project, which used spare CPU cycles to crunch data coming from space. Grid computing has been widely adopted by financial companies, which use it to run massive risk calculations. The concept of underutilized resources, together with the rise of the J2EE platform, gave birth to the precursor of cloud computing: application server virtualization. The idea was to run applications on demand and change what is available depending on the time of day and user activity.

Today's most vivid example of cloud computing is the Amazon Web Services package, available via an API. Amazon's offering includes a compute cloud (EC2), a storage service for large media files (S3), an indexing service (SimpleDB), and a queue service (SQS). These first building blocks already empower an unprecedented way of doing large-scale computing, and surely the best is yet to come.

9. SECURITY:

With the growing awareness of hacking, data security is paramount. Security is a broad topic that includes authentication, authorization, and secure information transmission. Authentication is about verifying the user's identity; a typical website asks for a password. Authentication typically occurs over SSL (Secure Sockets Layer), a way of transmitting encrypted information over HTTP. Authorization is about permissions and is important in enterprise systems, particularly those that define workflows. The recently developed OAuth protocol helps web services let users open up access to their private information selectively. This is how Flickr permits access to individual pictures or data sets.
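On the authentication side, one concrete rule is to never store passwords in plain text. Here is a sketch using PBKDF2 from the standard JDK crypto APIs (the iteration count and key length below are illustrative, not a recommendation):

```java
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

// Store a salted, slow hash of the password instead of the password.
public class PasswordHashing {
    public static void main(String[] args) throws Exception {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt); // a unique salt per user

        PBEKeySpec spec = new PBEKeySpec(
                "s3cret-password".toCharArray(), salt, 100_000, 256);
        byte[] hash = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                                      .generateSecret(spec)
                                      .getEncoded();

        // Persist the salt and the hash, never the password itself.
        System.out.println(Base64.getEncoder().encodeToString(hash));
    }
}
```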

Another area of security is network protection. This concerns operating systems, their configuration, and monitoring to thwart hackers. Not only the network is vulnerable; any piece of software is. The Firefox browser, marketed as the most secure, has to patch its code continuously. Writing secure code for your system requires understanding its specifics and potential problems.

10. RELATIONAL DATABASES:

Relational databases have recently gotten a bad reputation because they cannot scale well to support massive web services. Yet the relational database is one of the most important achievements in computing, one that has served us for two decades and will for a long time to come. Relational databases are excellent for order management systems, corporate databases, and P&L data.

At the heart of the relational database is the concept of representing data as records. Each record is added to a table, which defines the kind of information it holds. The database offers a way to search the records using a query language, nowadays SQL. The database also offers a way to correlate data from multiple tables. The technique of data normalization is about the correct ways of partitioning data among tables to minimize data redundancy and maximize retrieval speed.
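To make this concrete, here is how a record search looks through Java's JDBC API (the in-memory H2 connection URL and the books table are hypothetical placeholders; any relational database would do):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Searching records in a relational table with SQL via JDBC.
public class BookQuery {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:shop");
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT title FROM books WHERE author = ?")) {
            stmt.setString(1, "John Lakos"); // parameters are bound, not concatenated
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("title"));
                }
            }
        }
    }
}
```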