Google certainly sees duplicate content within a site as a serious issue. Google notes that websites change, and over time content comes and goes. A number of people work on the website, and all of them bring in redundant content and redundant URLs. As a result, your website loses its content integrity over time. You will not be penalized for having duplicate content within your website, but this does not mean that you are free from problems. Search engine bots cannot reliably tell on their own which of several duplicate URLs is the one you prefer. This ultimately affects your website in terms of PageRank distribution: your PageRank is split among the redundant pages instead of being concentrated on one.
This is what Google suggests webmasters do to deal with duplicate content.
First, you need to identify the pages that have duplicate content. To do this, use the site: query in Google. It lists the pages Google has indexed for your website, so you can spot URLs that serve the same content. Secondly, you must choose your preferred URLs and tell Google your URL hierarchy. It is not enough to choose your preferred URLs; you must also make sure to use only those URLs whenever you refer to the pages. You should use the same preferred URLs in your sitemaps as well. Google insists that you be consistent throughout. Inconsistency will confuse not only the search engines but also the users who are trying to find information on your website.
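As a rough illustration of staying consistent, the sketch below maps URL variants to one preferred form and flags sitemap entries that deviate from it. The host www.example.com, the chosen preferences (HTTPS, no trailing slash, no query string) and the function names are all assumptions made for the example; they do not come from Google's guidance.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed site preferences (hypothetical): HTTPS, the "www" host, no trailing slash.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"


def preferred_url(url: str) -> str:
    """Map any variant of a page URL on this site to the single preferred form."""
    parts = urlsplit(url)
    # Strip trailing slashes, but keep "/" for the home page.
    path = parts.path.rstrip("/") or "/"
    # Dropping the query string and fragment assumes they do not change the content.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, "", ""))


def inconsistent_urls(urls):
    """Return the URLs (for example, from a sitemap) that are not in preferred form."""
    return [u for u in urls if u != preferred_url(u)]
```

Running inconsistent_urls over the URLs in a sitemap or in internal links gives a quick list of entries to correct. A real site would need its own rules, for instance when query parameters do select different content.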
To overcome such issues, set up permanent (301) redirects on the duplicated pages. These pages must redirect to the corresponding preferred pages. This ensures that your website visitors always land on the right page, whichever URL they use to reach content that has been duplicated.
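A minimal sketch of a 301 redirect, using Python's standard http.server module, might look like the following. The REDIRECTS table and its paths are hypothetical; in practice redirects are usually configured in the web server or CMS rather than written by hand like this.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping from duplicate paths to their preferred paths.
REDIRECTS = {
    "/index.html": "/",
    "/shoes/": "/shoes",
    "/products/shoes": "/shoes",
}


def resolve(path: str):
    """Return (301, target) for a known duplicate path, otherwise (200, None)."""
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 200, None


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Note: self.path includes any query string, so only exact matches redirect.
        status, location = resolve(self.path)
        if status == 301:
            # A 301 tells browsers and crawlers that the move is permanent.
            self.send_response(301)
            self.send_header("Location", location)
            self.end_headers()
        else:
            body = b"preferred page\n"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
```

To try it locally, the handler could be passed to http.server.HTTPServer, for example HTTPServer(("localhost", 8000), RedirectHandler).serve_forever().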
It may not always be possible to use a 301 redirect. In such situations, Google recommends that webmasters use the rel="canonical" link element. This feature is also supported by Bing, Yahoo and Ask.com.
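The link element goes in the head section of each duplicate page and points at the preferred URL. The small helper below, whose name and example URL are assumptions for illustration, builds that element and escapes the URL so that characters such as & stay valid inside the attribute.

```python
from html import escape


def canonical_link(preferred_url: str) -> str:
    """Build the <link> element to place in the <head> of each duplicate page."""
    # quote=True also escapes double quotes, which matters inside an attribute value.
    return '<link rel="canonical" href="{}">'.format(escape(preferred_url, quote=True))


if __name__ == "__main__":
    print(canonical_link("https://www.example.com/shoes"))
```

Every duplicate of a page should carry the same href, so that all variants point at one preferred URL.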
Some webmasters try to block access to duplicate pages using the robots.txt file. However, Google advises against using this method to keep search engine bots from crawling duplicate pages. It prefers that webmasters use the rel="canonical" link element instead. If you completely block access to the duplicate pages, Google will consider them to be separate pages. The best solution is to allow the bots to crawl the duplicates and to point them to the preferred pages using one of the methods discussed above, not robots.txt.
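One way to check that a robots.txt file is not hiding duplicate pages from crawlers is to test the duplicate URLs against it with Python's standard urllib.robotparser module. This is only a sketch: the function name, the sample rules and the URLs are hypothetical, and the standard-library parser may not interpret every rule exactly as Google's own crawler does.

```python
from urllib.robotparser import RobotFileParser


def blocked_duplicates(robots_txt: str, duplicate_urls, user_agent: str = "Googlebot"):
    """Return the duplicate URLs that the given robots.txt rules stop a bot from crawling."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in duplicate_urls if not parser.can_fetch(user_agent, u)]


if __name__ == "__main__":
    # Hypothetical rules that block a printer-friendly copy of each page.
    rules = "User-agent: *\nDisallow: /print/\n"
    urls = [
        "https://www.example.com/print/shoes",
        "https://www.example.com/shoes/",
    ]
    print(blocked_duplicates(rules, urls))
```

Any URL the function returns is a duplicate that crawlers cannot reach, so a redirect or canonical link on that page would never be seen.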