Thursday, February 3, 2011

What methods of caching, other than to file or database, are available?

Currently I know of only two ways to cache data (I use PHP but I assume that the same will apply to most languages).

  1. Save the cache to a file
  2. Save the cache to a large DB field

Are there any other (perhaps better) ways of caching or is it really just this simple?

  • You can also cache in memory, which is much more efficient. Try memcached.

  • Maybe you should spell out more precisely what you want to cache. You have all these opportunities to cache:

    • Database access: first cache the data by tuning your RDBMS correctly, then add a layer that detects repeated queries for the same data (with ADOdb, for example).
    • Calculations: pull them out of loops in the code so you don't compute the same value multiple times. Here is your third way: store results in the user's session.
    • Compilation: precompile the PHP code with an extension like APC. That way you don't have to compile the same PHP code for every request.
    • The page sent to the user: make sure you're setting the right cache headers/META tags (do the world a favour and don't use ETags unless absolutely necessary); or make dynamic pages completely static (a batch process that generates .html pages); or use a proxy cache like Squid.
    • Prefetching: by this I mean all those opportunities you have to improve the user experience just by doing things while the user isn't looking your way. For example, preloading IMG tags in the HTML file, tuning the RDBMS for prefetching, precomputing results by storing complex computations in the database, etc.

    From my experience, I'd bet that your code can be improved a lot before we start talking about caching things. Consider, for example, how well structured the navigation of your site is and how well you control the user experience. Then check your code with a tool like Xdebug.

    Also verify how well you are writing your SQL queries and how well your tables are indexed. Then check your code again to look for opportunities to apply the rule "read many times but write just once".

    Use a simple tool like YSlow to point out other simple things to improve. Check your code again looking for opportunities to put logic in the browser (via JavaScript).

    From ggasp
  • Seconding memcached: it does the simple stuff well and can go distributed and all that jazz if you need it to.

  • If you're using Apache, you can use mod_rewrite to statically cache your web pages. Let's say you're using PHP, and you have a request for "/somepage.php". In your .htaccess file you put the following:

    RewriteEngine on
    # let's not cache urls with queries
    RewriteCond %{QUERY_STRING} ^$
    # only cache GET requests (not POST/PUT/DELETE)
    RewriteCond %{REQUEST_METHOD} ^GET$
    # check that the cached file exists and is > 0 bytes
    RewriteCond static_cache%{REQUEST_URI} -s
    # if all the conditions are met, rewrite this request to hit the static cache instead
    RewriteRule (^.*$) static_cache$1 [L]
    

    If your cache turns up empty, the request is handled by your PHP script as usual, so now it's simply a matter of making your PHP script store the resulting HTML in the cache. The simplest way to do this is using another .htaccess rule to prepend and append a couple of PHP files to all your PHP requests (this might or might not be a good idea, depending on your application):

    php_value auto_prepend_file "pre_cache.php"
    php_value auto_append_file "post_cache.php"
    

    Then you'd do something like this:

    pre_cache.php:

    ob_start();
    

    post_cache.php:

    $result = ob_get_flush();
    if (!$_SERVER['QUERY_STRING']) { # again, we're not caching query string requests
      # store the output under the same path the rewrite rule checks
      file_put_contents("static_cache" . $_SERVER['REQUEST_URI'], $result);
    }
    

    With some additional regular expressions in the .htaccess file we could probably start caching query string requests as well, but I'll leave that as an exercise for the reader :)

ConfigurationManager.AppSettings Performance Concerns

I plan to store all my config settings in my application's app.config file (using ConfigurationManager.AppSettings). As the user changes settings using the app's UI (clicking checkboxes, choosing radio buttons, etc.), I plan to write those changes out to AppSettings. At the same time, while the program is running I plan to access AppSettings constantly from a process that will be constantly processing data. Changes to settings via the UI need to affect the data processing in real time, which is why the process will be accessing AppSettings constantly.

Is this a good idea with regard to performance? Using AppSettings is supposed to be "the right way" to store and access configuration settings when writing .Net apps, but I worry that this method wasn't intended for a constant load (at least in terms of settings being constantly read).

If anyone has experience with this, I would greatly appreciate the input.

Update: I should probably clarify a few points.

This is not a web application, so connecting a database to the application might be overkill simply for storing configuration settings. This is a Windows Forms application.

According to the MSDN documentation, the ConfigurationManager is for storing not just application-level settings, but user settings as well. (Especially important if, for instance, the application is installed as a partial-trust application.)

Update 2: I accepted lomaxx's answer because Properties does indeed look like a good solution, without having to add any additional layers to my application (such as a database). Properties already does all the caching that others suggested: any changes and subsequent reads are done in memory, making it extremely fast, and it only writes the changes to disk when you explicitly tell it to. This means I can make changes to the config settings on the fly at run time and then only do a final save out to disk when the program exits.

Just to verify it would actually be able to handle the load I need, I did some testing on my laptop and was able to do 750,000 reads and 7,500 writes per second using Properties. That is so far above and beyond what my application will ever even come close to needing that I feel quite safe in using Properties without impacting performance.

Thank you, everyone, for your helpful answers.

  • Someone correct me if I'm wrong, but I don't think that AppSettings is typically meant for this type of configuration setting. Normally you would only put in settings that remain fairly static (database connection strings, file paths, etc.). If you want to store customizable user settings, it would be better to create a separate preferences file, or ideally store those settings in a database.

    From bcwood
  • Could I ask why you're not saving the user's settings in a database?

    Generally, I save application settings that are changed very infrequently in the appSettings section (the default email address error logs are sent to, the number of minutes after which you are automatically logged out, etc.). The scope of this really is the application, not the user, and it is generally used for deployment settings.

  • One thing I would look at doing is caching the appSettings on read, then flushing the settings from the cache on write, which should minimize the amount of actual load the server has to deal with when processing the appSettings.

    Also, if possible, look at breaking the appSettings up into configSections so you can read, write and cache related settings.

    Having said all that, I would seriously consider looking at storing these values in a database as you seem to actually be storing user preferences, and not application settings.

    From lomaxx

  • Check out SQLite, it seems like a good option for this particular scenario.

  • I would not use config files for storing user data. Use a db.

  • Since you're using a WinForms app, if it's in .NET 2.0 there's actually a user settings system (called Properties) that is designed for this purpose. This article on MSDN has a pretty good introduction to it.
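    As an illustration (not from the original answer), here is a minimal sketch of how the generated Settings class is typically used. The setting name NotifyOnComplete is a made-up example that you would define on the project's Settings designer tab:

    using System;

    class SettingsDemo
    {
        static void Main()
        {
            // Reads come from an in-memory copy of the settings, so they are cheap.
            bool notify = Properties.Settings.Default.NotifyOnComplete;
            Console.WriteLine("NotifyOnComplete = " + notify);

            // Writes only update the in-memory copy...
            Properties.Settings.Default.NotifyOnComplete = !notify;

            // ...until Save() is called, e.g. when the application exits.
            Properties.Settings.Default.Save();
        }
    }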

    If you're still worried about performance, then take a look at SQL Server Compact Edition, which is similar to SQLite but is the Microsoft offering. I've found it plays very nicely with WinForms, and there's even the ability to make it work with LINQ.

    From lomaxx
  • Dylan,

    Don't use the application config file for this purpose, use a SQL DB (SQLite, MySQL, MSSQL, whatever) because you'll have to worry less about concurrency issues during reads and writes to the config file.

    You'll also have better flexibility in the type of data you want to store. The appSettings section is just a key/value list which you may outgrow as time passes and as the app matures. You could use custom config sections but then you're into a new problem area when it comes to the design.

    From Kev
  • The appSettings section isn't really meant for what you are trying to do.

    When your .NET application starts, it reads in the app.config file and caches its contents in memory. For that reason, after you write to the app.config file, you'll have to somehow force the runtime to re-parse it so it can cache the settings again. That is unnecessary overhead.

    The best approach would be to use a database to store your configuration settings.

    Barring the use of a database, you could easily set up an external XML configuration file. When your application starts, you could cache its contents in a NameValueCollection or Hashtable object. As you change/add settings, you would do it to that cached copy. When your application shuts down, or at an appropriate time interval, you can write the cache contents back out to file.
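    As a rough sketch of that idea (not from the original answer; the file layout, element names and key names below are invented for illustration):

    using System.Collections.Specialized;
    using System.Xml;

    class SettingsCache
    {
        private readonly string path;
        private readonly NameValueCollection values = new NameValueCollection();

        public SettingsCache(string path)
        {
            // Read the external XML file once at startup into the in-memory collection.
            this.path = path;
            XmlDocument doc = new XmlDocument();
            doc.Load(path);
            foreach (XmlElement e in doc.SelectNodes("/settings/add"))
                values[e.GetAttribute("key")] = e.GetAttribute("value");
        }

        // All reads and writes during the program's lifetime hit only this cached copy.
        public string this[string key]
        {
            get { return values[key]; }
            set { values[key] = value; }
        }

        // Called at shutdown (or on a timer) to persist the cached values back to disk.
        public void Save()
        {
            XmlDocument doc = new XmlDocument();
            XmlElement root = doc.CreateElement("settings");
            doc.AppendChild(root);
            foreach (string key in values)
            {
                XmlElement e = doc.CreateElement("add");
                e.SetAttribute("key", key);
                e.SetAttribute("value", values[key]);
                root.AppendChild(e);
            }
            doc.Save(path);
        }
    }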

    From Terrapin

How to learn ADO.NET

I need to learn ADO.NET to build applications based on MS Office. I have read a good deal about ADO.NET in the MSDN Library, but everything seems rather messy to me.

What are the basics one must figure out when using ADO.NET? I think a few key words will suffice to let me organize my learning.

  • There are three key components (assuming you're using SQL Server):

    SqlConnection, SqlCommand, SqlDataReader

    (if you're using something else, replace "Sql" with "Something", like "MySqlConnection", "OracleCommand")

    Everything else is just built on top of that.

    Example 1:

    using (SqlConnection connection = new SqlConnection("CONNECTION STRING"))
    using (SqlCommand command = new SqlCommand())
    {
        command.CommandText = "SELECT Name FROM Users WHERE Status = @OnlineStatus";
        command.Connection = connection;
        command.Parameters.Add("@OnlineStatus", SqlDbType.Int).Value = 1; //replace with enum
        connection.Open();
        using (SqlDataReader dr = command.ExecuteReader())
        {
            List<string> onlineUsers = new List<string>();
            while (dr.Read())
            {
                onlineUsers.Add(dr.GetString(0));
            }
        }
    }

    Example 2:

    using (SqlConnection connection = new SqlConnection("CONNECTION STRING"))
    using (SqlCommand command = new SqlCommand())
    {
        command.CommandText = "DELETE FROM Users WHERE Email = @Email";
        command.Connection = connection;
        command.Parameters.Add("@Email", SqlDbType.VarChar, 100).Value = "user@host.com";
        connection.Open();
        command.ExecuteNonQuery();
    }
  • Another way of getting a command object is to call connection.CreateCommand(). That way you don't have to set the Connection property on the command object.
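    For instance, a sketch along the lines of Example 1 above (same assumed using directives for System.Data.SqlClient; the connection string and query are placeholders):

    using (SqlConnection connection = new SqlConnection("CONNECTION STRING"))
    using (SqlCommand command = connection.CreateCommand())
    {
        // CreateCommand() returns a command whose Connection property
        // is already set to this connection.
        command.CommandText = "SELECT Name FROM Users WHERE Status = @OnlineStatus";
        command.Parameters.Add("@OnlineStatus", SqlDbType.Int).Value = 1;
        connection.Open();
        using (SqlDataReader dr = command.ExecuteReader())
        {
            while (dr.Read())
            {
                Console.WriteLine(dr.GetString(0));
            }
        }
    }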

Passing multidimensional arrays as function arguments in C

In C, can I pass a multidimensional array to a function as a single argument when I don't know what the dimensions of the array are going to be?

In addition my multidimensional array may contain types other than strings.

  • Yes. Take a look at this page.

  • Simply put:

    An array is a pointer to a vector in memory. So, an array of arrays is a pointer to a pointer to a vector in memory.

    So:

    /**
     In a near Galaxy you have
    **/
    char a[100][100];
    giveMeTheAnswer( a );

    /**
     The event horizon must accept that multiverse
    **/
    void giveMeTheAnswer( char** pMatrix ) {
        pMatrix[0][0] = '4';
        pMatrix[0][1] = '2';
    }
    Johannes Schaub - litb : yeah that is totally wrong :) try pMatrix[1][0] = '4'; :)
    newacct : Wow that is so wrong. You completely confused arrays of arrays and arrays of pointers. They are completely different.
    John Bode : The type of pMatrix is `char (*)[100]` (pointer to 100-element array of char), not `char **`. Secondly, *arrays are not pointers*. Arrays and pointers are *different* objects that, in some contexts, can be used interchangeably.
    From ggasp
  • You can do this with any data type. Simply make it a double pointer:

    typedef struct {
        int myint;
        char* mystring;
    } data;

    data** array;

    But don't forget you still have to malloc the variable, and it does get a bit complex:

    //initialize
    int x, y, w, h;
    w = 10; //width of array
    h = 20; //height of array

    //malloc the 'y' dimension
    array = malloc(sizeof(data*) * h);

    //iterate over 'y' dimension
    for (y = 0; y < h; y++) {
        //malloc the 'x' dimension
        array[y] = malloc(sizeof(data) * w);

        //iterate over the 'x' dimension
        for (x = 0; x < w; x++) {
            //malloc the string in the data structure
            array[y][x].mystring = malloc(sizeof(char) * 50); //50 chars

            //initialize
            array[y][x].myint = 6;
            //copy into the malloc'd buffer instead of overwriting the pointer with a literal
            strcpy(array[y][x].mystring, "w00t");
        }
    }

    The code to deallocate the structure looks similar - don't forget to call free() on everything you malloced! (Also, in robust applications you should check the return of malloc().)

    Now let's say you want to pass this to a function. You can still use the double pointer, because you probably want to do manipulations on the data structure, not the pointer to pointers of data structures:

    int whatsMyInt(data** arrayPtr, int x, int y){
        return arrayPtr[y][x].myint;
    }

    Call this function with:

    printf("My int is %d.\n", whatsMyInt(array, 2, 4));

    Output:

    My int is 6.
    From superjoe30
  • Pass an explicit pointer to the first element with the array dimensions as separate parameters. For example, to handle arbitrarily sized 2-d arrays of int:

    void func_2d(int *p, size_t M, size_t N)
    {
      size_t i, j;
      ...
      p[i*N+j] = ...;
    }
    

    which would be called as

    ...
    int arr1[10][20];
    int arr2[5][80];
    ...
    func_2d(&arr1[0][0], 10, 20);
    func_2d(&arr2[0][0], 5, 80);
    

    Same principle applies for higher-dimension arrays:

    void func_3d(int *p, size_t X, size_t Y, size_t Z)
    {
      size_t i, j, k;
      ...
      p[i*Y*Z + j*Z + k] = ...;
      ...
    }
    ...
    int arr3[10][20][30];
    ...
    func_3d(&arr3[0][0][0], 10, 20, 30);
    
    From John Bode

How can I tell if a web client is blocking ads?

I'd like to get some statistics on how many people coming to my site have set their browser to block ads. Any tips on the best way to do this?

  • I suppose you could compare the ad impressions with the page views on your website (which you can get from your analytics software).

    From Vaibhav
  • Since programs like AdBlock actually never request the advert, you would have to look at the server logs to see if the same user accessed a webpage but didn't access an advert. This is assuming the advert is on the same server.

    If your adverts are on a separate server, then I would suggest it's impossible to do so.

    The best way to stop users from blocking adverts is to have inline text adverts which are generated by the server and dished up inside your HTML.

    From GateKiller
  • Add the user id to the request for the ad:

    <img src="./ads/viagra.jpg?{user.id}"/>

    That way you can check which ads are seen by which users.
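    To turn that into a statistic you could, for example, diff the two request logs. A hypothetical C# sketch (the log file names and their one-user-id-per-line format are invented here):

    using System;
    using System.Collections.Generic;
    using System.IO;

    class AdBlockStats
    {
        static void Main()
        {
            // One file of user ids that requested pages, one of user ids that requested the tagged ad image.
            HashSet<string> pageUsers = new HashSet<string>(File.ReadAllLines("page_user_ids.log"));
            HashSet<string> adUsers = new HashSet<string>(File.ReadAllLines("ad_user_ids.log"));

            int blocked = 0;
            foreach (string user in pageUsers)
                if (!adUsers.Contains(user))   // saw a page but never fetched the ad
                    blocked++;

            Console.WriteLine("{0} of {1} users never requested the ad.", blocked, pageUsers.Count);
        }
    }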

    From Jon Works
  • You need to think about the different ways that ads are blocked. The first thing to look at is whether they are running NoScript, so you could add a script that would check for that.

    The next thing is to see if they are blocking Flash; a small movie should do that.

    If you look at the adblock site, there is some indication of how it does blocking: http://adblockplus.org/en/faq_internal#elemhide

    If you look further down that page, you will see that conventional chrome probing will not work, so you need to try and parse the altered DOM.

    From UberAlex
  • AdBlock forum says this is used to detect AdBlock. After some tweaking you could use this to gather some statistics.

    <script language="JavaScript" type="text/JavaScript">
    setTimeout('detect_abp()', 10000);
    var isFF = (navigator.userAgent.indexOf("Firefox") > -1) ? true : false;
    var hasABP = false;

    function detect_abp()
    {
        if (isFF)
        {
            if (Components.interfaces.nsIAdblockPlus != undefined)
            {
                hasABP = true;
            }
            else
            {
                var AbpImage = document.createElement("IMG");
                AbpImage.id = 'abp_detector';
                AbpImage.src = '/textlink-ads.jpg';
                AbpImage.style.width = '0px';
                AbpImage.style.height = '0px';
                AbpImage.style.top = '-1000px';
                AbpImage.style.left = '-1000px';
                document.body.appendChild(AbpImage);
                hasABP = (document.getElementById('abp_detector').style.display == 'none');

                var e = document.getElementsByTagName("iframe");
                for (var i = 0; i < e.length; i++)
                {
                    if (e[i].clientHeight == 0)
                    {
                        hasABP = true;
                    }
                }
                if (hasABP == true)
                {
                    history.go(1);
                    // assigning to window.location redirects; window.location is not callable
                    window.location = "http://www.tweaktown.com/supportus.html";
                }
            }
        }
    }
    </script>

graph serialization

I'm looking for a simple algorithm to 'serialize' a directed graph. In particular I've got a set of files with interdependencies on their execution order, and I want to find the correct order at compile time. I know it must be a fairly common thing to do - compilers do it all the time - but my google-fu has been weak today. What's the 'go-to' algorithm for this?

  • I would expect tools that need this simply walk the tree in a depth-first manner and when they hit a leaf, just process it (e.g. compile) and remove it from the graph (or mark it as processed, and treat nodes with all leaves processed as leaves).

    As long as it's a DAG, this simple stack-based walk should be trivial.

    sleske : Yes, that's how you do it. It's called a depth-first search (DFS). And unless you are certain you have a DAG, you must check for back edges, otherwise a cycle will send you into an infinite loop.
  • I've come up with a fairly naive recursive algorithm (pseudocode):

    Map<Object, List<Object>> source; // map of each object to its dependency list
    List<Object> dest; // destination list

    function resolve(a):
        if (dest.contains(a)) return;
        foreach (b in source[a]):
            resolve(b);
        dest.add(a);

    foreach (a in source):
        resolve(a);

    The biggest problem with this is that it has no ability to detect cyclic dependencies - it can go into infinite recursion (i.e. stack overflow ;-p). The only way around that I can see would be to flip the recursive algorithm into an iterative one with a manual stack, and manually check the stack for repeated elements.

    Anyone have something better?
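    One common way around it is to track the nodes on the current recursion path during the depth-first walk, so a cycle is reported instead of recursing forever. A C# sketch of that idea (not part of the original answer):

    using System;
    using System.Collections.Generic;

    static class DependencySorter
    {
        // source: map of each object to its dependency list, as in the pseudocode above.
        public static List<T> Sort<T>(Dictionary<T, List<T>> source)
        {
            var dest = new List<T>();
            var done = new HashSet<T>();    // fully processed nodes
            var onPath = new HashSet<T>();  // nodes on the current recursion path

            foreach (T node in source.Keys)
                Resolve(node, source, dest, done, onPath);
            return dest;
        }

        private static void Resolve<T>(T a, Dictionary<T, List<T>> source,
                                       List<T> dest, HashSet<T> done, HashSet<T> onPath)
        {
            if (done.Contains(a)) return;
            if (!onPath.Add(a))             // already on the path: we have found a cycle
                throw new InvalidOperationException("Cyclic dependency involving " + a);

            List<T> deps;
            if (source.TryGetValue(a, out deps))
                foreach (T b in deps)
                    Resolve(b, source, dest, done, onPath);

            onPath.Remove(a);
            done.Add(a);
            dest.Add(a);                    // all of a's dependencies are already in dest
        }
    }

    Sort returns the objects in dependency order, or throws if the graph contains a cycle.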

    From Kieron
  • Topological Sort. From Wikipedia:

    In graph theory, a topological sort or topological ordering of a directed acyclic graph (DAG) is a linear ordering of its nodes in which each node comes before all nodes to which it has outbound edges. Every DAG has one or more topological sorts.

    Pseudo code:

    L ← Empty list where we put the sorted elements
    Q ← Set of all nodes with no incoming edges
    while Q is non-empty do
        remove a node n from Q
        insert n into L
        for each node m with an edge e from n to m do
            remove edge e from the graph
            if m has no other incoming edges then
                insert m into Q
    if graph has edges then
        output error message (graph has a cycle)
    else 
        output message (proposed topologically sorted order: L)
    
    Benjol : Eh... copied directly off wikipedia?
    Jason S : yes, please cite sources
  • If the graph contains cycles, how can there exist allowed execution orders for your files? It seems to me that if the graph contains cycles, then you have no solution, and this is reported correctly by the above algorithm.

    sleske : Yes, a topological sort is not possible if a graph contains cycles. This corresponds to the real world: If I ask you to do A before B, *and* B before A, there's no way you're gonna satisfy me ;-).

Best way to cache data in .NET

I am in the process of figuring out a cache strategy for our current setup. We currently have multiple web servers and I wanted to know what is the best way to cache data in this environment. I have done research about memcached and the native ASP.NET caching but wanted to get some feedback first. Should I go with a Linux box if I use memcached, or a win32 port of memcached? Thanks!

  • What about checking out Microsoft Velocity (http://code.msdn.microsoft.com/velocity)? Another option, if you don't want to start using Microsoft CTP-ware, is to check out NCache, which allows distributed cache/session state management (http://www.alachisoft.com/ncache/)

  • http://www.danga.com/memcached/

    worked awesome for me and have heard nothing but goodness about it

  • Dare Obasanjo has a pretty good blog post about this topic. You really need to assess what it is you're caching, why you're caching it and what your needs are before you can make a decision on a caching strategy.
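    For what it's worth, if that assessment ends up pointing at the native ASP.NET cache mentioned in the question (per-server and in-process, so it does not stay consistent across multiple web servers), the basic pattern is roughly the following sketch; the key name, the five-minute policy and the loader method are all made up:

    using System;
    using System.Web;
    using System.Web.Caching;

    public static class ReportCache
    {
        public static object GetReport()
        {
            object cached = HttpRuntime.Cache["report-42"];
            if (cached == null)
            {
                cached = LoadReportFromDatabase();   // your expensive call
                HttpRuntime.Cache.Insert("report-42", cached, null,
                    DateTime.UtcNow.AddMinutes(5), Cache.NoSlidingExpiration);
            }
            return cached;
        }

        private static object LoadReportFromDatabase()
        {
            return "expensive result";               // stand-in for real work
        }
    }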

    From lomaxx

What are the advantages of using SVN over CVS?

My company currently uses CVS as our de facto standard for source control. I've heard many people say SVN is better, and I know that it's newer, but other than that I am not sure of the benefits. I should note we use primarily Java and Eclipse, if that matters. I guess what I'm looking for is a good, succinct comparison of the two, noting advantages and disadvantages of using both in a Java/Eclipse development environment.

  • One of the many comparisons:

    http://wiki.scummvm.org/index.php/CVS_vs_SVN

    Now this is very specific to that project, but a lot of the stuff applies in general.

    Pro Subversion:

    • Support for versioned renames/moves (impossible with CVS): Fingolfin, Ender
    • Supports directories natively: It's possible to remove them, and they are versioned: Fingolfin, Ender
    • File properties are versioned; no more "executable bit" hell: Fingolfin
    • Overall revision number makes build versioning and regression testing much easier: Ender, Fingolfin
    • Atomic commits: Fingolfin
    • Intuitive (directory-based) branching and tagging: Fingolfin
    • Easier hook scripts (pre/post commit, etc): SumthinWicked (I use it for Doxygen after commits)
    • Prevents accidental committing of conflicted files: Salty-horse, Fingolfin
    • Support for custom 'diff' command: Fingolfin
    • Offline diffs, and they're instant: sev
  • The Subversion book has an appendix that details important differences from CVS, which may help you make your decision. The two approaches are more or less the same idea but SVN was specifically designed to fix long standing flaws in CVS so, in theory at least, SVN will always be the better choice.

    From Mat
  • CVS only tracks modification file by file while SVN tracks a whole commit as a new revision, which means that it is easier to follow the history of your project. Add the fact that all modern source control software use the concept of revision so it is far easier to migrate from SVN than it is from CVS.

    There is also the atomic commit problem. While I only encountered it once, it is possible for 2 people committing together in CVS to conflict with each other, losing some data and putting your client in an inconsistent state. When detected early, these problems are not major because your data is still out there somewhere, but it can be a pain in a stressful environment.

    And finally, not many tools are developed around CVS anymore. While the new and shiny tools like Git or Mercurial definitely lack tooling for now, SVN has a pretty large application base on any system.

  • SVN has 3 main advantages over CVS

    • it's faster
    • supports versioning of binary files
    • and adds transactional commit (all or nothing)
  • You should take a look at Git instead of SVN. It's a DVCS that's blazing-fast and very powerful. It's not as user-friendly as SVN, but it's improving in that regard, and it's not that hard to learn.

  • btw: CVSNT supports atomic commits

  • I'll second Eridius' suggestion of Git, but I'd expand it to the other DRCS (Distributed Revision Control System) such as Mercurial and bazaar.

    These products are fairly recent and the level of tooling and integration with them seems low at the moment (based on my initial research). I'd say they were best suited to the power-developers out there (and on here ;-)).

    On the other hand, what doesn't CVS currently do for you? From your initial question, you don't really have any, "CVS sucks at this, what could I use instead?"

    You've gotta weigh up the costs of any potential migration against the benefits. For an existing project, I think that it would be hard to justify.

  • One thing not to overlook is ecosystem. I was working at a CVSNT shop, and I was finding more and more open source tools supported SubVersion by default.

    From engtech
  • As someone who is in the middle of switching between CVS and SVN (initially we switched all of our projects with cvs2svn and then decided that we would transition by only using svn on new projects), here are some of the problems we have had.

    • Merging and branching are very different, and if you branch and merge frequently, unless you have SVN 1.5 running on your server you have to know when you branched (this isn't very clear in the TortoiseSVN dialogs). Michael says the branching and merging is intuitive; I would argue that after using CVS for 10 years, it is not.
    • If you are running the SVN server on Linux, it may be hard to get your SA to move to SVN 1.5, as the default install is 1.4.x.
    • Merging conflicts is not nearly as easy or as clear (at least to me and my co-workers) in TortoiseSVN as it is in TortoiseCVS. The three-pane approach takes some getting used to, and WinMerge (my preferred merge tool) doesn't do a three-pane merge.
    • Beware: many of the online tutorials and magazine articles I have read obviously don't branch and merge. You should set up your main repository as https://svn.yoursvnserver.com/repos/YourProject/Trunk and branches under https://svn.yoursvnserver.com/repos/YourProject/Branches/BranchX . You can clean up if you start your repos in the wrong place, but it leads to confusion.
  • You might also choose to migrate only the latest code from CVS into SVN and freeze your current CVS repo. This will make migration easier, and you can also build your legacy releases in the old CVS repo.

    From webwesen

What are the correct pixel dimensions for an apple-touch-icon?

I'm not sure what the correct size should be.

Many sites seem to repeat that the apple-touch-icon should be 57x57 pixels but cite a broken link as their source.

Hanselman's and playgroundblues's comments suggest different sizes including 163x163 and 60x60.

Apple's own apple.com icon is 129x129!

See my related question: How do I give my web sites an icon for iPhone?

  • I don't think there is a "correct size". Since the iPhone really is running OSX, the icon rendering system is pretty robust. As long as you give it a high-quality image with the right aspect ratio and a resolution at least as high as the actual output will be, the OS will downscale very cleanly. My site uses a 158x158 and the icon looks pixel-perfect on the iPhone screen.

    From Rex M
  • Depends on how much detail you want it to have; it needs to have an aspect ratio of 1:1 (basically, it needs to be square).

    I would go with Apple's own 129x129.

    From saniul
  • The official size is 57x57. I would recommend using the exact size simply due to the fact that it takes less memory when loaded (unless Apple caches the scaled representation). With that said, Rex is right that any square size will work

    From NilObject
  • Can you cite a source for that size?

    Sure, although the official reference is under lock and key of ADC (Google cache still has it), many of the non-NDA sites have the tutorials on how to create the icons. One example is here:

    DAN DICKINSON: THE PRIMARY VIVID WEBLOG

    From NilObject
  • From the Google cache of Apple Developer Connection - Web Apps Dev Center - Designing Content...

    Create a Web Clip Bookmark Icon

    iPhone and iPod touch allow a user to save a Web Clip bookmark to your site on their Home Screen.

    To specify a bookmark icon for all pages of a web site, place a PNG image named "apple-touch-icon.png" at the root directory of your web server - similar to the "favicon.ico" for site icons.

    To override the site bookmark icon on a specific web page, insert a <link> element similar to <link rel="apple-touch-icon" href="/customIcon.png"/> within the <head> element of the page.

    The bookmark icon dimensions should be 57x57 pixels. If the icon is a different size it will be scaled and cropped to fit.

    Safari will automatically composite the icon with the standard "glassy" overlay so it looks like a built-in iPhone or iPod application.

    Georg : Your answer is outdated. See http://stackoverflow.com/questions/1598/what-are-the-correct-pixel-dimensions-for-an-apple-touch-icon/3490980#3490980
  • NilObject's link led me to the great blog post Catchup on your Icon at makentosh.com

    ... Of course all of this inconsistency had to be dealt with eventually, right? Well 2.0 dealt with it in fine fashion! Finally 57x57 actually meant 57x57.

    ...each and every pixel ... rendered perfectly.

  • Hello, it seems that Apple guidelines as of August 3, 2010 now include the "High resolution" images (for iPhone 4) in their "required" icon sizes. Looks like we need to provide both a 57x57 and a 114x114 image now, as well as a 640x960 title image.

    http://developer.apple.com/iphone/library/documentation/userexperience/conceptual/mobilehig/IconsImages/IconsImages.html#//apple_ref/doc/uid/TP40006556-CH14-SW2

    From OtomoBill

Using object property as default for method property

I'm trying to do this (which produces an unexpected T_VARIABLE error):

public function createShipment($startZip, $endZip, $weight = $this->getDefaultWeight()){}

I don't want to put a magic number in there for weight, since the object I am using has a "defaultWeight" parameter that all new shipments get if you don't specify a weight. I can't put the defaultWeight in the shipment itself, because it changes from shipment group to shipment group. Is there a better way to do it than the following?

public function createShipment($startZip, $endZip, $weight = 0){
    if ($weight <= 0){
        $weight = $this->getDefaultWeight();
    }
}
  • This isn't much better:

    public function createShipment($startZip, $endZip, $weight = null){
        $weight = !$weight ? $this->getDefaultWeight() : $weight;
    }

    // or...

    public function createShipment($startZip, $endZip, $weight = null){
        if ( !$weight )
            $weight = $this->getDefaultWeight();
    }
    From Kevin
  • This will allow you to pass a weight of 0 and still work properly. Notice the === operator, this checks to see if weight matches "null" in both value and type (as opposed to ==, which is just value, so 0 == null == false).

    PHP:

    public function createShipment($startZip, $endZip, $weight = null){
        if ($weight === null)
            $weight = $this->getDefaultWeight();
    }
    From pix0r
  • @pix0r

    That is a good point; however, if you look at the original code, when the weight is passed as 0 it uses the default weight.

    From Kevin
  • Try using a class constant (a parameter's default value has to be a constant expression, so a static property won't work there):

    class Shipment
    {
        const DefaultWeight = 0;

        public function createShipment($startZip, $endZip, $weight = self::DefaultWeight){
            // your function
        }
    }
    
    From paan
  • Neat trick with boolean OR operator:

    public function createShipment($startZip, $endZip, $weight = 0){
        $weight or $weight = $this->getDefaultWeight();
        ...
    }
    

Are there any conversion tools for porting Visual J# code to C#?

Are there any conversion tools for porting Visual J# code to C#?

  • I've used the JLCA (Java Language Conversion Assistant) from Microsoft before. Worked well, but the last release was in 2004.

  • Have you tried Reflector + FileGenerator to dump the compiled .NET library to C#? The only problem that I can see is that you are going to have a slight amount of cleanup for J#-specific references and no comments. But it should provide a good clean solution and get you 90% of the way.

  • Nick mentioned Reflector + FileGenerator. That will kind of work. I've tried it and it works partially; but you still have to sift through the code and fix compiler errors, as Reflector doesn't do a perfect job.

    JLCA is dead, last I heard. And if your project is J# (e.g. part .NET) it will stumble on .NET things. For example, if you've got any Windows Forms stuff in your J# project, JLCA will puke on it.

    Overall, my suggestion would be use Reflector + FileGenerator, fix the compiler errors, and never look back at J#. :-)

    Another interesting alternative is Mono's IKVM, which can run real Java on top of .NET. This would work if your J# code is all Java stuff and no .NET stuff or MS-specific Java.

    Kevin Driedger : It's not really a Mono project so I would put the IKVM webpage: http://www.ikvm.net/

Linking two Office documents

Problem:

I have two spreadsheets that each serve different purposes but contain one particular piece of data that needs to be the same in both spreadsheets. This piece of data (one of the columns) gets updated in spreadsheet A but needs to also be updated in spreadsheet B.

Goal:

A solution that would somehow link these two spreadsheets together (keep in mind that they exist on two separate LAN shares on the network) so that when A is updated, B is automatically updated for the corresponding record.

*Note that I understand fully that a database would probably be a better plan for tasks such as these but unfortunately I have no say in that matter.

**Note also that this needs to work for Office 2003 and Office 2007

  • So you mean that AD743 on spreadsheet B must be equal to AD743 on spreadsheet A? Try this:

    • Open both spreadsheets on the same machine.
    • Go to AD743 on spreadsheet B.
    • Type =.
    • Go to spreadsheed A and click on AD743.
    • Press enter.

    You'll notice that the formula is something like '[path-to-file+file-name].worksheet-name!AD743'.

    The value on spreadsheet B will be updated when you open it. In fact, it will ask you if you want to update. Of course, your connection must be up and running for it to update. Also, you can't change the name or the path of spreadsheet A.

  • I can't say if this is overkill without knowing the details of your usage case, but consider creating a spreadsheet C to hold all data held in common between the two. Links can become dizzyingly complex as spreadsheets age, and having a shared data source might help clear up the confusion.

    Perhaps even more "enterprise-y" is the concept of just pasting in all data that otherwise would be shared. That is the official best practice in my company, because external links have caused so much trouble with maintainability. It may seem cumbersome at first, but I've found it may just be the best way to promote maintainability in addition to ease of use, assuming you don't mind the manual intervention.

How do I know which SQL Server 2005 index recommendations to implement, if any?

We're in the process of upgrading one of our SQL Server instances from 2000 to 2005. I installed the performance dashboard (http://www.microsoft.com/downloads/details.aspx?FamilyId=1d3a4a0d-7e0c-4730-8204-e419218c1efc&displaylang=en) for access to some high level reporting. One of the reports shows missing (recommended) indexes. I think it's based on some system view that is maintained by the query optimizer.

My question is: what is the best way to determine when to take an index recommendation? I know that it doesn't make sense to apply all of the optimizer's suggestions. I see a lot of advice that basically says to try the index and to keep it if performance improves, and to drop it if performance degrades or stays the same. I'm wondering if there is a better way to make the decision and what best practices exist on this subject.

  • You're best off researching the most common types of queries that run against your database and creating indexes based on that research.

    For example, if there is a table which stores website hits, which is written to very often but hardly ever read from, then don't index the table at all.

    If, however, you have a list of users which is accessed more often than it is written to, then I would first create a clustered index on the column that is accessed the most, usually the primary key. I would then create an index on commonly searched columns, and those which are used in ORDER BY clauses.

    From GateKiller
  • The advice you got is right. Try them all, one by one.

    There is NO substitute for testing when it comes to performance. Unless you prove it, you haven't done anything.

    From Stu
  • First thing to be aware of:

    When you upgrade from 2000 to 2005 (by using detach and attach) make sure that you:
    A) set the compatibility level to 90
    B) rebuild the indexes
    C) run UPDATE STATISTICS WITH FULLSCAN

    If you don't do this you will get suboptimal plans.

    If the table is mostly written to, you want as few indexes as possible. If the table is used for a lot of read queries, you have to make sure that the WHERE clause is covered by indexes.

    From SQLMenace

What is the difference between all the different types of version control?

After being told by at least 10 people on SO that version control was a good thing even if it's just me, I now have a follow-up question.

What is the difference between all the different types of version control and is there a guide that anybody knows of for version control that's very simple and easy to understand?

  • Version Control is essential to development, even if you're working by yourself because it protects you from yourself. If you make a mistake, it's a simple matter to rollback to a previous version of your code that you know works. This also frees you to explore and experiment with your code because you're free of having to worry about whether what you're doing is reversible or not. There are two major branches of Version Control Systems (VCS), Centralized and Distributed.

    Centralized VCS are based on using a central server, where everyone "checks out" a project, works on it, and "commits" their changes back to the server for anybody else to use. The major Centralized VCS are CVS and SVN. Both have been heavily criticized because "merging" "branches" is extremely painful with them. [TODO: write explanation on what branches are and why merging is hard with CVS & SVN]

    Distributed VCS let everyone have their own server, where you can "pull" changes from other people and "push" changes to a server. The most common Distributed VCS are Git and Mercurial. [TODO: write more on Distributed VCS]

    If you're working on a project I heavily recommend using a distributed VCS. I recommend Git because it's blazingly fast, but it has been criticized as being too hard to use. If you don't mind using a commercial product, BitKeeper is supposedly easy to use.

    From num1
  • I would start with:

    Then once you have read up on it, download and install SVN, TortoiseSVN and skim the first few chapters of the book and get started.

  • Eric Sink has a good overview of source control. There are also some existing questions here on SO.

  • Another recent question on the same topic

  • We seem to be in the golden age of version control, with a ton of choices, all of which have their pros and cons.

    Here are the ones I see most used:

    • svn - currently the most popular open source?
    • git - very hot since Linus switched to it
    • mercurial - some smart people I know swear by it
    • cvs - the one everybody is switching from
    • perforce - imho, the best features, but it's not open source. The two-user license is free, though.
    • visual sourcesafe - I'm not much in the Microsoft world, so I have no idea about this one, other than people like to rag on it as they rag on everything from Microsoft.
    • sccs - for historical interest we mention this, the great-grandaddy of many of the above
    • rcs - and the grandaddy of many of the above

    My recommendation: you're safest with either git, svn or perforce, since a lot of people use them, they are cross platform, have good guis, you can buy books about them, etc.

    Don't consider cvs, sccs, rcs; they are antique.

    The nice thing is that, since your projects will be relatively small, you will be able to move your code to a new system once you're more experienced and decide you want to work with another system.

  • If you are working by yourself in a Windows environment, then the single user license for SourceGear's Vault is free.

  • The answer to another question also applies here, most importantly

    Jon Works said:
    The most important thing about version control is:

    JUST START USING IT

    His answer goes into more detail, and I don't want to be accused of plagiarism so take a look.

  • To everyone just starting using version control:

    Please do not use git (or hg or bzr) because of the hype

    Use git (or hg or bzr) because they are better tools for managing source code than SVN.

    I used SVN for a few years at work, and switched over to git 6 months ago. Without learning SVN first I would be totally lost when it comes to using a DVCS.

    For people just starting out with version control:

    • Start by downloading SVN
    • Learn why you need version control
    • Learn how to commit, checkout, branch
    • Learn why merging in SVN is such a pain

    Then switch over to a DVCS and learn:

    • How to clone/branch/commit
    • How easy it is to merge your branches back (go branch crazy!)
    • How easy it is to rewrite commit history and keep your branches
      up to date with the main line (git rebase -i, )
    • How to publish your changes so others can benefit

    tldr; crowd:

    Start with SVN and learn the basics, then graduate to a DVCS.

    Will Robertson : I don't get this whole "DVCS is harder" meme. In their basic incarnations, hg and bzr are at least as easy to use as svn; in fact, I'd posit that it's *easier* to run `git init` rather than to set up an svn server somehow.
    Spoike : @Will Robertson: I agree, I never found DVCS more difficult than CVCS. On the contrary, it's been easier to grok. E.g. It's much easier to set up a repo in git while svn has a multiple step process just to follow the convention trunk/branches/tags even if it is locally on your own computer.
    Don Kirkby : Hi @Will and @Spoike, I can't say for sure if distributed version control is harder to learn for a new user than centralized is. However, I can compare my transitions from Source Safe to CVS and from there to SVN with my transition from SVN to Bazaar. When you move from one centralized tool to another, they have mostly the same features with a different (hopefully improved) interface. When you move to a distributed tool, there is a fundamental shift in how the repository is used and what your work flow looks like. I found that harder to wrap my head around.
    From Jon Works
  • We use and like Mercurial. It follows a distributed model - it eliminates some of the sense of having to "check in" work. Mozilla has moved to Mercurial, which is a good sign that it's not going to go away any time soon. One con, in my opinion, is that there isn't a very good GUI for it. If you're comfortable with the command line, though, it's pretty handy.

    Mercurial Documentation Unofficial Manual

    Martin Geisler : The OpenJDK project has also switched and the Python language is in the process of switching to Mercurial.
  • The simple answer is: do you like Undo buttons? The answer is of course yes, because we as human beings make mistakes all the time.

    As programmers, it's often the case though that it can take several hours of testing, code changes, overwrites, deletions, file moves and renames before we work out that the method we are trying to use to fix a problem is entirely the wrong one and the code is more broken than when we started.

    As such, Source Control is a massive Undo button to revert the code to an earlier time when the grass was green and the food plentiful. And not only that; because of how source control works, you can still keep a copy of your broken code, in case a few weeks down the line you want to refer to it again and cherry-pick any good ideas that did come out of it.

    I personally (though it could be called overkill) use a free single-user license version of SourceGear Fortress (which is their Vault source control product with bug tracking features). I find the UI really simple to use, and it supports both the checkout > edit > checkin model and the edit > merge > commit model. It can be a little tricky to set up though, requiring you to run a local copy of IIS and SQL Server. You might want to try a smaller program, like those recommended by other answers here. See what you like and what you can afford.

    From Nidonocu
  • Mark said:

    git - very hot since Linus switched to it

    I just want to point out that Linus didn't switch to it, Linus wrote it.

    Craig McQueen : well, he switched from using Bitkeeper to writing and using git.
  • Just start using source control, no matter what type you use. What you use doesn't matter; it's the use of it that is important.

    From Espenhh
  • Like everyone else has said, which SC to use really depends on your needs, your budget, your environment, etc.

    At its root, source control is designed to provide a central repository of all your code, and track who did what to it when. There should be a complete history, and you can get products that do full changelogs, auditing, access control, and on and on...

    Each product that is out there starts to shine (so to speak) when you start to look at how you want or need to incorporate SC into your environment (whether it's your personal code and documents or a large corporations). And as people use them, they discover that the tool has limitations, so people write new ones. SVN was born out of limitations that the creators saw with CVS. Linus wanted something better for the Linux kernel, so now we have git.

    I would say start using one (something like SVN which is very popular and pretty easy to use) and see how it goes. As time progresses you may find that you need some other functionality, or need to interface with other systems, so you may need SourceSafe or another tool.

    Source control is always important, and while you can get away with manually re-numbering versions of PSD files or something as you work on them, you're going to forget to run that batch script once or twice, or likely forget which number went with which change. That's where most of these SC tools can help (as long as you check-in/check-out).

    From Milner
  • See also this SO question: