
I have purchasing history data for grocery shopping. I am trying to find items that are purchased abnormally often under certain conditions. For instance, I want to find the items that are frequently purchased when customers shop online and are willing to pay an extra shipping fee.

To find items that are particularly (or abnormally) frequently purchased in that situation (through online stores, paying a shipping fee), which machine learning algorithm should I apply, and how should I use it to identify those items?

I found the arules R package, which mines association rules from purchasing history, and tried to apply it. But the package seems to be based on a different principle from what I have in mind.

Does anyone have an idea about my problem? An R package that addresses it would be perfect.

John legend2
  • This is quite a complex problem that requires very thorough data if you want to predict a new customer's behavior. No satisfactory algorithm exists, and it is a very hot research field. There are a few approaches, but for most of them I think you have to write the code yourself. –  Feb 13 '18 at 16:56
  • @DuttaA: the concrete question "find items that are particularly frequently purchased through online stores by paying shipping fee" doesn't seem complex, and doesn't even seem to need applied AI. As for predicting customer preferences, all big online shops have it. See, for example, Amazon's "customer who buy this item frequently ..." – pasaba por aqui Feb 13 '18 at 17:01
  • You could use clustering techniques such as DBSCAN to identify tendencies (clusters) in buying frequency. I'm not familiar with R, but [there's a package for that](https://cran.r-project.org/web/packages/dbscan/README.html). – lamontap Feb 14 '18 at 13:48
  • @pasabaporaqui I think the OP did not frame the question correctly and is basically talking about a recommender system. A statistical approach to prediction will not extend easily to new users. –  Feb 15 '18 at 11:40

1 Answer


Let's start with the concrete question, and then move on to the general problem.

a)

The concrete question "find items that are particularly frequently purchased through online stores by paying shipping fee" needs little or no applied AI, just a bit of statistics.

The question talks about "item purchased" and "buy method", so we have a database with entries like:

sale(online,item1).

sale(shop,item2).

sale(online,item2).

sale(online,item2).

...

(Note that records can be repeated.)

The online share of item "X", p(X), is defined as the number of online sales of this item, sale(online,X), divided by the total number of sales of this item, sale(_,X):

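$$p(X) = \frac{|\mathrm{sale}(\mathrm{online}, X)|}{|\mathrm{sale}(\_, X)|}$$

where $|\cdot|$ counts the matching records, repeats included.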

In the example data above, p(item1) = 1/1 and p(item2) = 2/3.

A high p(X) marks an item that is preferentially bought online.

Other probabilities can be defined for similar cases.
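As a minimal sketch of this calculation (in Python rather than R, purely for illustration), the ratio can be computed by counting records per item. The `sales` list simply mirrors the four example facts above, and the function name `online_share` is a hypothetical helper, not part of any package:

```python
from collections import Counter
from fractions import Fraction

# Toy records mirroring the sale(channel, item) facts above; repeats are allowed.
sales = [
    ("online", "item1"),
    ("shop", "item2"),
    ("online", "item2"),
    ("online", "item2"),
]

def online_share(records):
    """Return {item: fraction of that item's sales that were made online}."""
    total = Counter(item for _, item in records)
    online = Counter(item for channel, item in records if channel == "online")
    # Counter returns 0 for items that were never sold online.
    return {item: Fraction(online[item], total[item]) for item in total}

shares = online_share(sales)
for item, p in sorted(shares.items()):
    print(item, p)
```

On real data, items with very few sales will give unstable ratios, so it is probably sensible to require a minimum number of sales per item before ranking by p(X).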

b)

For the general case, we are talking about data mining. There are very good packages (open source, ...) for it: Weka, IBM DWE, ... . For example, you can run Weka's J48 over a database defined as:

sale( purchase_identifier, buy method, item )

where "purchase_identifier" must group the items that were purchased in a single transaction (one cash receipt). J48 (Weka's C4.5 decision tree learner) will then yield rules such as: item "foo" is usually purchased online when item "bar" is also purchased.
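The sketch below is not J48 and does not call Weka. It only illustrates, in plain Python, the grouping step described above and the kind of statistic behind such a rule. The rows, the item names, and the helpers `to_baskets` and `online_share_given` are all made up for illustration:

```python
from collections import defaultdict

# Hypothetical rows in the form sale(purchase_identifier, buy_method, item).
rows = [
    (1, "online", "foo"),
    (1, "online", "bar"),
    (2, "shop", "foo"),
    (3, "online", "foo"),
    (3, "online", "bar"),
    (4, "shop", "bar"),
]

def to_baskets(rows):
    """Group rows into {purchase_identifier: (buy_method, set_of_items)}."""
    methods = {}
    items = defaultdict(set)
    for pid, method, item in rows:
        methods[pid] = method  # assumes one buy method per purchase
        items[pid].add(item)
    return {pid: (methods[pid], items[pid]) for pid in methods}

def online_share_given(baskets, *required):
    """Fraction of baskets containing all `required` items that were bought online."""
    matching = [m for m, its in baskets.values() if set(required) <= its]
    if not matching:
        return None  # no basket contains the required items
    return sum(m == "online" for m in matching) / len(matching)

baskets = to_baskets(rows)
foo_alone = online_share_given(baskets, "foo")
foo_with_bar = online_share_given(baskets, "foo", "bar")
```

If the online share for "foo" rises noticeably once "bar" is also required, that is the pattern a rule like the one above would express. A tree learner searches for such splits automatically instead of checking one item pair at a time.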

pasaba por aqui