I have a Next.js app, and I have a problem: tens of thousands of pages have been indexed by Google and I don't know how to get them removed from the index. I already submitted a new sitemap, but because those pages were previously indexed and cached, it's as if the new sitemap doesn't matter. Those old pages are still being "hit" by Googlebot, which is eating up all my bandwidth.

Next.js doesn't let you send back a 404 status code with dynamically generated SSG pages. So even though these are "bad" pages, the status code is 200, which tells Googlebot it's a good page and to keep indexing it. On these pages I render a 404 component, but that doesn't change the status code, so the best I can do is shoehorn in a meta tag.
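For context, here is roughly the pattern I'm dealing with (a simplified sketch, not my actual code; the file name, the Props shape, and the isValidSlug() check are just placeholders):

```tsx
// pages/[slug].tsx — simplified sketch of a dynamically generated SSG page.
// isValidSlug() stands in for whatever lookup decides a slug is "bad".
import Head from "next/head";
import type { GetStaticPaths, GetStaticProps } from "next";

type Props = { slug: string; valid: boolean };

export const getStaticPaths: GetStaticPaths = async () => ({
  paths: [],            // everything is generated on demand
  fallback: "blocking", // so unknown slugs still get rendered (status 200)
});

export const getStaticProps: GetStaticProps<Props> = async ({ params }) => {
  const slug = String(params?.slug ?? "");
  return { props: { slug, valid: await isValidSlug(slug) } };
};

export default function Page({ slug, valid }: Props) {
  if (!valid) {
    // The response is still a 200 here; the only lever I have is a meta tag.
    return (
      <>
        <Head>
          <meta name="robots" content="noindex,nofollow" />
          <meta name="googlebot" content="noindex,nofollow" />
        </Head>
        <h1>Page not found</h1>
      </>
    );
  }
  return <h1>{slug}</h1>;
}

// Placeholder for the real validity check.
async function isValidSlug(slug: string): Promise<boolean> {
  return slug.length > 0;
}
```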

The best I can do is add "another" meta tag to the page; I can't get rid of the base ones. So, will this prevent Googlebot from reindexing?

My HTML now shows:

 <meta name="robots" content="index,follow"/> <meta name="googlebot" content="index,follow"/> <meta name="description" content="my website description"/> ....... bunch of other meta tags. <meta property="og:image:width" content="300"/> <meta property="og:image:height" content="250"/> <meta property="og:locale" content="en_US"/> <meta property="og:site_name" content="mySiteName"/> <link rel="canonical" href="websiteurl"/> <meta name="robots" content="noindex,nofollow"/> <meta name="googlebot" content="noindex,nofollow"/> 

So, the new meta tags come last.
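If it helps, here's a hypothetical illustration of how the page ends up with both sets of robots tags: the base pair comes from a shared head component I can't easily touch, and my page-level <Head> just appends the noindex pair after it (as I understand it, next/head only dedupes tags that share a key prop, which these don't):

```tsx
// Hypothetical illustration only; my real base tags live in a shared
// layout/component that I can't easily modify.
import Head from "next/head";

// Rendered on every page by the shared layout: the "base" tags.
export function BaseMeta() {
  return (
    <Head>
      <meta name="robots" content="index,follow" />
      <meta name="googlebot" content="index,follow" />
    </Head>
  );
}

// Rendered only on the "bad" pages. Because neither set of tags carries a
// matching `key` prop, next/head does not dedupe them, so both pairs end up
// in the document head, with these appended last.
export function NoIndexMeta() {
  return (
    <Head>
      <meta name="robots" content="noindex,nofollow" />
      <meta name="googlebot" content="noindex,nofollow" />
    </Head>
  );
}
```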
