MENU
カタログクリップ
本ページはプロモーションを含みます。

How to avoid the “0” discovered URLs bug in Search Console

2021 3/02
広告
wordpress
2021年2月27日2021年3月2日

“0” discovered URLs problem on loading the sitemaps will occur under certain conditions, which seems to be a bug in Google Search Console. I will show you how to avoid the bug and successfully fetch the sitemap.

目次

Detailed situation of 0 discovered URLs

In the “Sitemaps” page of the Google Search Console, “Total discovered URLs” remains 0 even after waiting for several months, as shown below.

I met with the problem of “0”detected URL, when I created a sitemap using Google XML Sitemaps (or XML Sitemap Generator for WordPress) 4.1.1, which is a popular plugin in WordPress. This “sitemap.xml” is a sitemap index file, and no sub-sitemaps are read. I note that The ownership of the website was verified by the domain authentication method. 

 

When I tried to register a sub-sitemap file, such as “sitemap-pt-post-2021-02.xml”, directly, Search Console returned “Couldn’t fetch” message and didn’t read it.

 

The details of the error were “Sitemap could not be read”.

So we can presume that the cause of this “0” discovered URLs is in the sub-sitemaps.

Why couldn’t sub-sitemaps be read

I made trials and errors to find out why the sub-sitemap couldn’t be read. Then I found the following:

  • When the file name of the unread sub-sitemap was changed, the sub-sitemap was read successfully.
  • When a soft link (another file name) was made on the unread sub-sitemap, the sub-sitemap was read.

It has been confirmed that the http headers are the same between the files. There is a file name dependency between readable and unreadable sitemaps. So we can say that the “0” discovered URLs problem is caused by a bug in the Search Console, but by a user misconfiguration (at least in my situation).

 

Next, I investigated which file name could be read and which file name could not be read.

Below, “o” is an ok example and “x” is a bad example.

—-

o https://abcdefgh.biz/sitemap_pt_postaaaa.xml
x https://abcdefgh.biz/sitemap-pt-postaaaa.xml
x https://abcdefgh.biz/sitemap-pt-post2021.xml
x https://abcdefgh .biz / sitemap-pt-post—-.xml
x https://abcdefgh.biz/sitemap-pt-post.xml

Underscores are ok, but hyphens are not ok.

 

o https://abcdefgh.biz/sitemap_pt_post.xml
o https://abcdefgh.biz/sitemap_pt_post2021.xml
x https://abcdefgh.biz/sitemap_pt_post20211.xml
x https://abcdefgh.biz/sitemap0pt0post20211.xml
x https //abcdefgh.biz/sitemap_pt_post20210.xml
x https://abcdefgh.biz/sitemap_pt_post202102.xml
x https://abcdefgh.biz/sitemap_pt_post20210200.xml

Long names are bad. The maximal length: the domain part is 21 characters long (“https://abcdefgh.biz/”), file name is 19 chars, and 4 chars for extension. 44 characters in total.

 

x https://abcdefgh.biz/sitemap_pt_post20211.xm
x https://abcdefgh.biz/sitemap_pt_post20211
o https://abcdefgh.biz/sitemap_pt_post2021

Extensions will not be counted. The file path will be limited to 40 characters in total except for the extension.

 

x https://abcdefgh.biz/sitemapaptapostaaaa.xml -> 19 chars
x https://abcdefgh.biz/12345678901234567890.xml -> 20
x https://abcdefgh.biz/1234567890123456789.xml -> 19
o https://abcdefgh.biz/123456789_123456789.xml -> 19

Even if the file name excluding the extension is 19 characters, it may not be good, but if you separate it with an underscore “_”, it will be ok.

 

x https://abcdefgh.biz/sitemapaptapostaa.xml -> 17
x https://abcdefgh.biz/sitemapaptapostaaa.xml -> 18
x https://abcdefgh.biz/sitemapapt-postaaaa.xml -> 19
o https://abcdefgh.biz/sitemapapt_postaaaa.xml -> 19

If you do not separate the file name with an underscore, even 17 characters file name is bad, but if you separate it with an underscore, 19 chars file name is ok. The hyphen delimiter is bad.

 

o https://abcdefgh.biz/sitemapaptp_ostaaaa.xml -> 19
o https://abcdefgh.biz/sitemapap_tpostaaaa.xml -> 19
x https://abcdefgh.biz/sitemapapt_postaaaaa.xml -> 20
x https://abcdefgh.biz/sitemap0pt0post2021.xml -> 19

Even if the file name is separated with an underscore, 20 characters filename is bad.

 

o https://abcdefgh.biz/_sitemapaptpostaaaa.xml -> 19
x https://abcdefgh.biz/sitemapaptpostaaaa_.xml ->19

Even if the delimiter position is located at the front, it is ok. If at the back end, it is bad.

 

x https://abcdefgh.biz/sitemapaptpostaaa_a.xml -> 19
x https://abcdefgh.biz/sitemapaptpostaa_aa.xml -> 19
o https://abcdefgh.biz/sitemapaptposta_aaa.xml -> 19
o https://abcdefgh.biz/sitemapaptpost_aaaa.xml -> 19

If _ is located before the 15th character (36th character from the beginning), it is ok. If it is located at 16th character or later, it is bad?

—-

So we can avoid these bad examples to make the Search Console read the sitemaps. I couldn’t assume the essential cause of this bug. I hope that Search Console engineers will use this information to fix bugs.

Workaround

The workaround is:

  • Do not put hyphens in the file name.
  • When the file name is long, separate it with an underscore.

If you’re having trouble with a similar issue, try this workaround.

Making readable sitemaps using Google XML sitemaps

If you are using Google XML sitemaps, you can avoid the “0” discovered URLs problem by modifying the plugin. The sub-sitemap names created by Google XML sitemaps are like this: “https://abcdefgh.biz/sitemap-pt-post-2021-01.xml”

 

If you adopt the above workaround, reduce the number of characters in the file name by replacing “sitemap” with “sm”, and replace the hyphen with underscore. For example, “https://abcdefgh.biz/sm_pt_post_2021_01.xml”. 

 

After modifying the plugin and registering the sitemap, the sitemap is now loaded successfully. It was read in a few minutes completely after being registered.

Summary

The “0” discovered URLs problem in the search console has been asked for many years on the community bulletin board of the search console. A common expert’s answer is that “it is due to the low site rank and slow crawl frequency, and the sitemap may be read after half a year.”, and is selected as a good answer and the thread is closed. However, at least in my environment, this problem has filename dependency, and assumedly is due to a bug in the Search Console. I don’t feel the bug will be fixed soon, because there are few people who have encountered the problem, but I hope that such information will be the trigger for the correction someday.

 

wordpress

関連記事

  • Search Consoleで検出されたURLが0になるバグを回避する方法。
    2021年2月27日
  • WordPressで、progressタグの前のbrタグが消えて改行しない時の対応方法。
    2018年11月2日
  • 軽量ソースコード ハイライター highlight.js を WordPressへ導入する方法。
    2017年1月20日
  • Googleクローラは、JavaScriptを理解するか?テスト結果を紹介!
    2017年1月10日
  • すぐできる!WordPressのドメインを同一サーバ内で移転する6ステップの方法
    2016年12月14日
  • 一行追加するだけ!Xeory Baseに Facebookシェアボタンを追加する方法。
    2016年12月14日
  • 一ヶ月使用してわかった WordPress 用 google公式 Adsenseプラグインの利点と欠点
    2016年11月28日
  • ブログ移転時必見!はてなブログ(無料)からリダイレクトをする簡単で効果的な方法!
    2016年11月15日
カテゴリー
  • コンピューター
    • gnuplot & eps
    • mac
    • matplotlib
    • wordpress
  • ホーム・家電
    • アイロン
    • オーディオ
    • オーラルケア
      • ジェットウォッシャー
      • 音波振動歯ブラシ
    • カメラ
    • カー用品
    • クリーナー
    • テレビ、レコーダー
    • ドアホン
    • メンズ美容家電
      • ラムダッシュ
    • ルンバ
    • 一覧比較
    • 工具
    • 浄水器
    • 温水洗浄便座
    • 炊飯器
    • 空気清浄機・加除湿機
    • 空調・季節家電
    • 美容家電
      • フェイスケア
      • ヘアケア
      • ボディーケア
    • 血圧計
    • 調理器具
    • 電子レンジ
  • 健康
  • 家事
    • パン
    • 料理
    • 育児
    • 食品
      • おせち
      • コーヒー
  • 書籍
  • 知識
  • 趣味
    • ペン字
    • ロードバイク・クロスバイク
    • 車
    • 鉄道模型
    • 電子工作
サイト内検索
最近の投稿
  • モナークHGとモナークM7の4つの違い。[ニコン双眼鏡]
  • ニコン モナーク M7とM5の4つの違い。[双眼鏡]
  • Speed Wi-Fi DOCK 5G 01の実機使用レビュー。
  • RZ-760、GT-760、ST-760の違い。[ガス缶・CB缶]
  • RZ-730、RZ-730S、RZ-730SBK、RZ-730SWH、ST-Y450の違い。
  • TC-EA35WとTC-E123SBKの違いを比較。[スティッククリーナー]
  1. ホーム
  2. コンピューター
  3. wordpress
  4. How to avoid the “0” discovered URLs bug in Search Console
  • ホーム
  • プライバシーポリシー

© カタログクリップ
contact@beiznotes.org

目次