App Store Statistical Analysis

I have always wondered how Apple orders apps when you search for a keyword. It makes sense that Apple would rank apps by downloads or revenue, but is there anything else?

I was playing around with the iTunes API and wrote a small tool with it (as mentioned here). This time, I built some features (in the machine learning sense) and ran a t-test on each one to see which features are correlated with ranking. The goal was to test the hypothesis that these features drive app ranking:

 

Features:

  • is app universal (iPhone and iPad)
  • Minimum Age permitted to use the app (Rating)
  • Rating count for current version
  • Release Date (number of seconds)
  • Size of app
  • Number of languages supported by app
  • Release date of current version
  • Total rating count
  • Average rating for current version (5 stars, 4 stars, …)
  • Average rating total
  • Minimum version of iOS supported
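
Here is a minimal sketch of how one app record could be turned into a numeric feature vector like the list above. It is not the original script. The field names (`features`, `trackContentRating`, `userRatingCountForCurrentVersion` and so on) are my assumption of what an iTunes Search API result looks like, and treating the dates as seconds since the Unix epoch is also an assumption.

```python
from datetime import datetime, timezone


def to_epoch_seconds(iso_date):
    """Convert a timestamp like '2016-11-01T08:00:00Z' to seconds since the epoch."""
    parsed = datetime.strptime(iso_date, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def extract_features(app):
    """Turn one app record (a dict) into 11 numeric features.

    All dictionary keys below are assumed field names, not confirmed ones.
    """
    # Keep only major.minor of the OS version so it can be read as a float.
    os_parts = app.get("minimumOsVersion", "0").split(".")
    return [
        1.0 if "iosUniversal" in app.get("features", []) else 0.0,  # universal app
        float(app.get("trackContentRating", "4+").rstrip("+")),     # minimum age
        float(app.get("userRatingCountForCurrentVersion", 0)),
        to_epoch_seconds(app["releaseDate"]),
        float(app.get("fileSizeBytes", 0)),
        float(len(app.get("languageCodesISO2A", []))),
        to_epoch_seconds(app["currentVersionReleaseDate"]),
        float(app.get("userRatingCount", 0)),
        float(app.get("averageUserRatingForCurrentVersion", 0)),
        float(app.get("averageUserRating", 0)),
        float(".".join(os_parts[:2])),                               # minimum iOS
    ]
```

Missing rating fields default to 0 here, which is a design shortcut. Dropping such apps from the analysis would be an equally reasonable choice.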

 

Then I ran the t-test analysis on data for the 200 apps returned by a search for “slideshow”. Here are the results:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.815
Model:                            OLS   Adj. R-squared:                  0.804
Method:                 Least Squares   F-statistic:                     75.60
Date:                Sat, 26 Nov 2016   Prob (F-statistic):           4.15e-63
Time:                        23:03:01   Log-Likelihood:                -1064.2
No. Observations:                 200   AIC:                             2150.
Df Residuals:                     189   BIC:                             2187.
Df Model:                          11                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1            -4.8211      8.774     -0.549      0.583       -22.129    12.487
x2            -1.1637      1.725     -0.675      0.501        -4.566     2.239
x3            -0.0048      0.004     -1.214      0.226        -0.013     0.003
x4         -9.449e-08   7.26e-08     -1.302      0.195     -2.38e-07  4.87e-08
x5         -2.732e-08   6.49e-08     -0.421      0.674     -1.55e-07  1.01e-07
x6            -1.0004      0.644     -1.554      0.122        -2.270     0.269
x7          2.485e-07   7.35e-08      3.378      0.001      1.03e-07  3.94e-07
x8            -0.0009      0.000     -2.464      0.015        -0.002    -0.000
x9            -6.4069      3.369     -1.901      0.059       -13.053     0.240
x10           -5.0835      3.873     -1.313      0.191       -12.724     2.557
x11          -10.5053      3.112     -3.376      0.001       -16.643    -4.367
==============================================================================
Omnibus:                        9.896   Durbin-Watson:                   0.563
Prob(Omnibus):                  0.007   Jarque-Bera (JB):                4.634
Skew:                          -0.094   Prob(JB):                       0.0985
Kurtosis:                       2.279   Cond. No.                     4.92e+09
==============================================================================
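
To make the table easier to read: each t value is the coefficient divided by its standard error, and a feature passes the test when the absolute t value is above a critical value. The short sketch below applies that rule to the t column copied from the table. The two cutoffs are approximate two-sided critical values for 189 residual degrees of freedom, which I am supplying as an assumption rather than reading from the output.

```python
# t statistics copied from the "t" column of the table above.
t_values = {
    "x1": -0.549, "x2": -0.675, "x3": -1.214, "x4": -1.302,
    "x5": -0.421, "x6": -1.554, "x7": 3.378, "x8": -2.464,
    "x9": -1.901, "x10": -1.313, "x11": -3.376,
}

# Approximate two-sided critical values for 189 residual degrees of freedom.
T_CRIT_5_PERCENT = 1.973
T_CRIT_10_PERCENT = 1.653


def significant(t_stats, critical_value):
    """Return the feature names whose |t| exceeds the critical value."""
    return [name for name, t in t_stats.items() if abs(t) > critical_value]


# Each t value is the coefficient divided by its standard error, e.g. for x11:
t_x11 = -10.5053 / 3.112

print(significant(t_values, T_CRIT_5_PERCENT))   # ['x7', 'x8', 'x11']
print(significant(t_values, T_CRIT_10_PERCENT))  # ['x7', 'x8', 'x9', 'x11']
print(round(t_x11, 3))                           # -3.376
```

Note that x9 (p = 0.059) only clears the looser 10% cutoff.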

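As the table above shows, most of the features do not pass the t-test (their p-values are well above 0.05). Only four stand out: x7, x8, x9 and x11, although x9 is borderline at p = 0.059. In terms of the feature list, these are: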

  • Release date of current version
  • Total rating count
  • Average rating for current version (5 stars, 4 stars, …)
  • Minimum version of iOS supported

Re-running the analysis with only these four features gives:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.807
Model:                            OLS   Adj. R-squared:                  0.803
Method:                 Least Squares   F-statistic:                     204.4
Date:                Sat, 26 Nov 2016   Prob (F-statistic):           9.47e-69
Time:                        23:07:07   Log-Likelihood:                -1068.5
No. Observations:                 200   AIC:                             2145.
Df Residuals:                     196   BIC:                             2158.
Df Model:                           4                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
x1          1.516e-07   1.41e-08     10.744      0.000      1.24e-07  1.79e-07
x2            -0.0011      0.000     -3.541      0.000        -0.002    -0.001
x3           -10.3560      2.382     -4.347      0.000       -15.054    -5.658
x4           -12.0864      2.766     -4.370      0.000       -17.541    -6.631
==============================================================================
Omnibus:                       22.168   Durbin-Watson:                   0.479
Prob(Omnibus):                  0.000   Jarque-Bera (JB):                7.048
Skew:                          -0.088   Prob(JB):                       0.0295
Kurtosis:                       2.097   Cond. No.                     1.11e+09
==============================================================================

 

As you can see in the table above, all four of these features are significant for app ranking. That suggests an app with the following traits would show up higher:

  • The app with the most recent update
  • The app with the most ratings across all versions
  • The app rated higher (e.g. 5 stars) for the current version
  • The app that supports the latest iOS

 

The script is available on GitHub.

Hope that helps 🙂


ASO: App Store Optimization

I had heard of ASO before, but I never believed in it. This is the story of our experience at SmartStory and how little things can make a huge difference.

For a while, we only had downloads from France and Germany. Part of that was because we had strong localization and listed German and French as supported languages, but how come we did not have any downloads from Asia Pacific or the United States?

Desperately, I looked up ASO again, along with the old-school ways of optimizing keywords. Unfortunately, all the websites I knew for that purpose are commercial, and for a cheap entrepreneur they were out of reach. With a little more searching, I found the iTunes API. Great tool. This API lets you search the App Store and find out where you stand.

My gut feeling was that people search for “slide show” to find apps like SmartStory. I was very wrong, but working from that assumption, I wrote a Python script that searches the US, Canadian, French, German, Japanese, Korean and Chinese app stores to see where SmartStory ranks. I uploaded the tool to GitHub as a public repository too. I still need to write a fancy README, but I figured I had better share it now.
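
For readers who want to try something similar, here is a minimal sketch of such a rank lookup. It is not the script from my repository. The endpoint and its `term`, `country`, `entity` and `limit` parameters are what I understand the iTunes Search API to accept, and matching an app by the start of its `trackName` is a simplification I picked for illustration.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://itunes.apple.com/search"


def build_search_url(term, country, limit=200):
    """Build the search URL for one keyword in one country's store."""
    query = urllib.parse.urlencode(
        {"term": term, "country": country, "entity": "software", "limit": limit}
    )
    return f"{SEARCH_URL}?{query}"


def fetch_results(term, country):
    """Download the search results (network call, assumed response shape)."""
    with urllib.request.urlopen(build_search_url(term, country), timeout=30) as response:
        return json.load(response).get("results", [])


def find_rank(results, app_name):
    """Return the 1-based position of the app in the results, or None if absent."""
    for position, app in enumerate(results, start=1):
        if app.get("trackName", "").lower().startswith(app_name.lower()):
            return position
    return None


def format_rank(rank, limit=200):
    """Show unlisted apps as '200+' because only `limit` results come back."""
    return str(rank) if rank is not None else f"{limit}+"
```

Calling `format_rank(find_rank(fetch_results("slide show", "fr"), "SmartStory"))` for each country code would then print one rank per store.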

The tool's output was mind-blowing. Here is the app's rank in the different app stores:

  • US: 200+ (iTunes API only returns 200 results, so 200+ basically means unlisted)
  • FR: 89
  • DE: 119
  • KR: 152
  • CN: 200+
  • ES: 181
  • MX: 200+
  • CA: 200+

 

No wonder all the downloads were coming from France and Germany. Basically, people everywhere else could not see the app at all.

What should we do?

With more searching and reading, it turned out we should have used the Google Keyword Research Tool. I entered “slide show” and got this:

google_search_tool

I was dead wrong. People are not searching for “slide show”; they are searching for “video maker”, “slideshow maker” and “slideshow with music”. Granted, this is a web tool, but the same dilemma applies to the App Store as well.

The Apple App Store has pretty simple logic (according to here). If you have “slide, show” in your keywords, you are going to miss “slideshow”. It is not as smart as Google at figuring these things out.
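
A toy model makes the problem concrete. The sketch below is only my simplified reading of the behavior described above (exact word matching against a comma-separated keyword field), not Apple's actual search algorithm, and the function names are made up for this example.

```python
def parse_keywords(keyword_field):
    """Split a comma-separated keyword field into a set of lowercase words."""
    return {word.strip().lower() for word in keyword_field.split(",") if word.strip()}


def naive_match(keyword_field, query):
    """True only if every word of the query appears verbatim in the keywords."""
    keywords = parse_keywords(keyword_field)
    return all(word in keywords for word in query.lower().split())


print(naive_match("slide, show", "slide show"))            # True
print(naive_match("slide, show", "slideshow"))             # False
print(naive_match("slide, show, slideshow", "slideshow"))  # True
```

Under this model, the one-word and two-word spellings are unrelated keywords, so both have to be listed.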

 

I also wrote another Python script that parses all 200 apps in the App Store results and counts the words in them (using a simple MySQL database). One day, I will make a nice web tool around it, but right now it is a hackish command-line script. Contributions are welcome 🙂

I used the Google tool and my script's results to re-engineer and re-enter all of the keywords, and I also renamed the app (because I ran out of space). The results were great:
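
The core of the word counting can be sketched in a few lines. This version swaps the MySQL database for an in-memory `collections.Counter`, and the app titles are made up for the example, so treat it as an illustration rather than the real script.

```python
import re
from collections import Counter


def count_words(titles, min_length=3):
    """Count how often each word appears across a list of app titles."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z0-9]+", title.lower())
        # Skip very short words such as "a" or "of".
        counts.update(word for word in words if len(word) >= min_length)
    return counts


# Hypothetical titles, not real search results.
titles = ["Slideshow Maker with Music", "Video Maker - Slideshow", "Photo Slideshow"]
print(count_words(titles).most_common(2))  # [('slideshow', 3), ('maker', 2)]
```

The most frequent words across the top results are good candidates for your own keyword list.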

Checking app store for slideshow+with+music in us: 67
Checking app store for slideshow+with+music in fr: 33
Checking app store for slideshow+maker in fr: 59
Checking app store for slideshow+with+music in de: 36
Checking app store for slideshow+with+music in kr: 41
Checking app store for slideshow+with+music in cn: 38
Checking app store for slideshow+with+music in ca: 42

Of course, there is still a lot to be done, but how hard is updating keywords, right? And did I mention it is free?

Now that the app is visible, there is a higher chance that people will download it and possibly love it. I still have a lot of work to do: I need to get those numbers down to single digits . . .

First-time user experience

In the mobile app world, you only have a few seconds to convince a user that she should keep the app on her phone. If you fail, she will not hesitate to uninstall …

For this reason, we added a very short tutorial. We had noticed that almost half of our users open the app and close it right away, without pressing any button. Our FTUE is simple: just show an arrow pointing to the button they are supposed to touch to start the experience:

screen-shot-2016-11-21-at-8-54-41-pm

Adding this was really simple and took less than an hour. Since then, we no longer see people closing the app right away, and we are also seeing better retention numbers.

 

Before:

retention_before

After:

retention_after

 

We are happy to see such a big impact on our retention after adding such a simple feature 🙂

 

 

 

Localization

While optimizing the app, we realized we could easily use Google Translate to automatically translate it into different languages.

The next question was which languages to pick for localization. My first instinct was to use English, Spanish, French and German. I also wanted Persian because it is my own language, so those were the first five.

 

However, after some research, I found the following image (from here).

Poster-The-Global-Mobile-Games-297x841mm_b

 

This shows that revenue from Asia Pacific is about twice that of North America (US and Canada combined). So we used the following languages for our 1.2 version:

  • English
  • French
  • German
  • Spanish
  • Persian
  • Korean
  • Simplified Chinese
  • Japanese

 

To get all those languages, we used a Google Spreadsheet and its Google Translate function, GOOGLETRANSLATE. For example, to translate “keyword” (stored in cell B2) from English to Spanish:

 GOOGLETRANSLATE(B2,"en","es")
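
If you have many languages, the formulas can be generated instead of typed by hand. The helper below is a small sketch. The language codes are my assumption of what GOOGLETRANSLATE accepts (in particular "zh-CN" for Simplified Chinese), so check them in your own sheet before relying on them.

```python
# Language codes are assumptions; verify them in your own spreadsheet.
TARGET_LANGUAGES = {
    "French": "fr",
    "German": "de",
    "Spanish": "es",
    "Persian": "fa",
    "Korean": "ko",
    "Simplified Chinese": "zh-CN",
    "Japanese": "ja",
}


def translate_formula(cell, target_code, source_code="en"):
    """Build a GOOGLETRANSLATE formula that translates one spreadsheet cell."""
    return f'=GOOGLETRANSLATE({cell},"{source_code}","{target_code}")'


for language, code in TARGET_LANGUAGES.items():
    print(language, translate_formula("B2", code))
```

Each printed formula can be pasted into the column for that language.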

 

This helped us kickstart some downloads. Surprisingly, most of our downloads came from the French App Store:

localization

After some scripting, which is the subject of a different post, we realized that we ranked higher in the French App Store:

  • Checking app store for us: 200+
  • Checking app store for fr: 89
  • Checking app store for de: 119
  • Checking app store for jp: 200+
  • Checking app store for kr: 152
  • Checking app store for cn: 200+
  • Checking app store for es: 181
  • Checking app store for mx: 200+
  • Checking app store for ca: 200+

So the fact that we localized into these languages helped us rank higher in the French, German, Korean and Spanish app stores.

 

SmartStory

SmartStory started as an idea from my wife. She has a PhD, and she wanted to improve on the current storytelling apps by adding some smartness.

 

Current apps like Animato or other slideshow creators blindly take some photos and create a slideshow. We want to use machine learning and deep learning to automatically pick the best pictures and automatically add a title to each photo. Then, using popular music that matches the mood of the photos, we want to tell the story behind those pictures.

 

At the time of writing this post, there are three versions of the app out. In this blog, I am going to write about the improvements and the story behind SmartStory.

 

Our Facebook page: https://www.facebook.com/smartstoryapp/

Our website: http://smartstory.ipronto.net/