I have done a lot of research about caching data in files (serialize/unserialize vs json_encode/decode, var_export, igbinary) and MySQL queries (optimization, stored procedures, query cache), but at this point I wonder what the best way is to optimize a concrete case like the following.

Sorry in advance: this is a long topic for a small answer, I guess, but it is necessary to understand the project. And excuse my poor English, which is not my first language.

Let's imagine that we have these database relationships.

Description of the database (estimated number of records in parentheses):

  • MODULE (10): the type of an ITEM; could be an article, forum topic, ad, news...

  • ITEM (millions): any type of content with a title and some text

  • CATEGORY (50): item categories (animals, politics, cars, computers...)

  • TAG (hundreds of thousands): a category's tags (e.g. for politics: International, France, Barack Obama...)

  • ITEM_TAG (ouch): item/tag associations

So we have several relationships, and each one is recorded at ITEM creation/update.

I have already cached ITEM data in folders and files, as in the following example:

public function cacheItem()
{
    $req = mysql_query("SELECT id, title, content, id_mod, id_cat
            FROM ITEM
            WHERE ITEM.id = '".$this->id."'") or die(mysql_error());
    if (mysql_num_rows($req) == 1)
    {
        $this->itemData = mysql_fetch_array($req);
        $this->folder = floor($this->id/1000); // 1000 items max per folder
        $this->itemUrl = $this->folder."/".$this->id.".txt";
        if (!is_dir($this->folder))
        {
            mkdir($this->folder, 0777, true); // the folder must exist before writing
        }
        // file_put_contents() creates the file itself, so touch() is not needed
        file_put_contents($this->itemUrl, serialize($this->itemData), LOCK_EX);
    }
}

And I retrieve them with unserialize(file_get_contents($url)); this part works like a charm!
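
For completeness, the matching read path can be sketched like this (the function name getItem() and the cache-miss convention are my own assumptions, not part of the original code):

```php
<?php
// Sketch of the read side: rebuild the path from the id using the same
// 1000-items-per-folder scheme, then unserialize the cached array.
// Returns false on a cache miss so the caller can fall back to MySQL.
function getItem($id)
{
    $folder = floor($id / 1000);
    $url = $folder."/".$id.".txt";
    if (!file_exists($url))
    {
        return false; // cache miss: rebuild from the database
    }
    return unserialize(file_get_contents($url));
}
```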

Now I wish to optimize the lists of ITEMs so they can be displayed by several options (for example), each paginated with a limit of 100 per page:

  • ALL ITEMs
  • ITEMs of a MODULE

  • ITEMs of a CATEGORY

  • ITEMs of a CATEGORY and a MODULE

  • ITEMs of a TAG

  • ITEMs of a TAG and a CATEGORY

  • ITEMs of a TAG and a CATEGORY and a MODULE

I already know how to do this in SQL and to put the results in a cache tree.

My problem with those cache files is that when an ITEM is created/updated, the lists may have to be refreshed very strictly (kept consistent).

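One way to keep such list caches strict is to invalidate (delete) them whenever an ITEM is saved, so the next request rebuilds them from MySQL. A sketch, where the list file names (lists/all.txt, lists/mod-... and so on) are my own invented convention, not part of the question:

```php
<?php
// Delete every cached list file whose filter could contain the saved ITEM.
// The next read of each deleted list regenerates it from the database.
function invalidateListsFor($idMod, $idCat, array $tagIds)
{
    $lists = array("lists/all.txt",
                   "lists/mod-$idMod.txt",
                   "lists/cat-$idCat.txt",
                   "lists/cat-$idCat-mod-$idMod.txt");
    foreach ($tagIds as $t)
    {
        $lists[] = "lists/tag-$t.txt";
        $lists[] = "lists/tag-$t-cat-$idCat.txt";
        $lists[] = "lists/tag-$t-cat-$idCat-mod-$idMod.txt";
    }
    foreach ($lists as $f)
    {
        if (file_exists($f))
        {
            unlink($f); // stale: force a rebuild on next read
        }
    }
    return $lists; // returned only so callers/tests can inspect the set
}
```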
First question:

So what will happen if ITEMs (and therefore those lists too) are created/updated at the same time?

Will the LOCK_EX of file_put_contents() do its job while files are being read with file_get_contents()?

Second question:

I understand that the more work PHP does, the less MySQL does (and vice versa), but what is the better (faster to display) way to build those paginated lists, which will be read every second or more often, and only modified when an ITEM is added/updated?

  • My cache system (I don't think so...)

  • Stored procedures in MySQL

  • Several database servers and/or several file servers

  • Other

Any ideas, examples, or links greatly appreciated.

P.S.: just for fun, I may ask "how does Facebook do it?" and "how does Stack Overflow do it?"

1 Answer

First question:

Your operations should be fine with LOCK_EX. The files may get locked if accessed simultaneously, which will definitely slow things down, but all operations should complete correctly. However, this is a good example of why you should not implement your own cache system.

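Note that file_put_contents(..., LOCK_EX) only protects the write side; file_get_contents() does not lock on its own. To make reads equally safe, the read can take a shared lock (LOCK_SH), which blocks only while a writer holds LOCK_EX. A sketch (the helper name readCacheFile() is my own):

```php
<?php
// Read a cache file under a shared lock so a concurrent LOCK_EX write
// cannot be observed half-finished. Returns false if the file is missing.
function readCacheFile($url)
{
    $fp = @fopen($url, "r");
    if ($fp === false)
    {
        return false; // cache miss
    }
    flock($fp, LOCK_SH);          // blocks while a writer holds LOCK_EX
    $data = stream_get_contents($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
    return $data;
}
```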
Second question:

MySQL will definitely be faster than your cache system (unless you do some seriously wicked coding, and not in PHP). Databases like MySQL have put a lot of work into optimizing their performance.

I don't believe that stored procedures in MySQL will offer you any real benefit over plain old SELECT queries in the examples provided above.

Using a NoSQL approach like MongoDB can help you if you use sharding on a server cluster. This is more difficult to write, and more servers cost more money. Also, it is not clear from your question whether moving to a different database system is an option.

If you stick with MySQL, it is probably easier to implement load-balanced application servers than a database server cluster. With this in mind, more work done in PHP is preferable to more work in MySQL. I would not follow this approach though, because you would be giving up much for only a small benefit.

In short, I recommend that you stick to plain SELECT queries to get what you need. Run your application and database on separate servers, and use the more powerful server as your DB server.
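As an illustration of the plain SELECT approach recommended here, the paginated list queries from the question could be built like this. The ITEM_TAG column names (id_item, id_tag) are assumptions, since the question does not show them; the page size of 100 comes from the question:

```php
<?php
// Build one of the list queries: any of cat/mod/tag may be null (not filtered).
// Values are cast to int here as a simple stand-in for proper prepared statements.
function buildListQuery($cat, $mod, $tag, $page)
{
    $where = array();
    if ($cat !== null) $where[] = "i.id_cat = " . (int)$cat;
    if ($mod !== null) $where[] = "i.id_mod = " . (int)$mod;
    $join = "";
    if ($tag !== null)
    {
        $join = " JOIN ITEM_TAG it ON it.id_item = i.id"; // assumed column names
        $where[] = "it.id_tag = " . (int)$tag;
    }
    $sql = "SELECT i.id, i.title FROM ITEM i" . $join;
    if ($where)
    {
        $sql .= " WHERE " . implode(" AND ", $where);
    }
    $sql .= " ORDER BY i.id DESC LIMIT 100 OFFSET " . (100 * (int)$page);
    return $sql;
}
```

With indexes on (id_cat, id_mod) and on ITEM_TAG (id_tag, id_item), MySQL can serve each variant efficiently without any stored procedures.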

PS. Facebook wrote a pre-compiler for PHP to make their code run faster. In my opinion, PHP is not a very fast language, and you can get better results from Python or Node.js.

Stack Overflow uses ASP.NET MVC with MS SQL Server. They have a single big, powerful server for the database and apparently prefer DB queries where they can use them. They also use load-balanced application servers that are separate from their DB server.
