24.2 Typical HTML structure

  • HTML has a hierarchical (tree) structure.
  • The structure is composed of elements.
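  • Each element typically has
    • Start tag, e.g. <p>
    • Attributes (optional, written inside the start tag)
    • Content (text and/or child elements)
    • End tag, e.g. </p> (void elements such as <br> have none)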
  • Each element can have children, which are themselves elements.
  • Because the structure is consistent, the same elements can be located programmatically, which is what makes web scraping possible.
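
The points above can be illustrated with a minimal sketch in Python, using only the standard-library html.parser module. The sample page, its class names ("book", "title", "price"), and the OutlineParser helper are all hypothetical, made up for illustration. The sketch records each start tag with its nesting depth and attributes, and collects the text content of every h2 element whose class is "title".

```python
from html.parser import HTMLParser

# Hypothetical sample page, invented for this illustration.
SAMPLE_HTML = """
<html>
  <body>
    <div class="book">
      <h2 class="title">First title</h2>
      <p class="price">10.00</p>
    </div>
    <div class="book">
      <h2 class="title">Second title</h2>
      <p class="price">12.50</p>
    </div>
  </body>
</html>
"""


class OutlineParser(HTMLParser):
    """Hypothetical helper: tracks nesting depth and collects title text."""

    def __init__(self):
        super().__init__()
        self.depth = 0          # current nesting level in the element tree
        self.outline = []       # (depth, tag, attributes) for each start tag
        self.titles = []        # text content of <h2 class="title"> elements
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)     # attributes arrive as (name, value) pairs
        self.outline.append((self.depth, tag, attrs))
        self.depth += 1         # children of this element are one level deeper
        if tag == "h2" and attrs.get("class") == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        self.depth -= 1         # the end tag closes the current element
        if tag == "h2":
            self._in_title = False

    def handle_data(self, data):
        # Content between a start tag and its end tag.
        if self._in_title and data.strip():
            self.titles.append(data.strip())


parser = OutlineParser()
parser.feed(SAMPLE_HTML)
parser.close()

# Print the hierarchy: indentation reflects parent/child nesting.
for depth, tag, attrs in parser.outline:
    print("  " * depth + tag, attrs)

print(parser.titles)
```

Because both "book" blocks share the same structure, a single rule (h2 with class "title") extracts both titles. This sketch assumes every element in the sample has an explicit end tag; real pages often contain void elements or omitted end tags, for which a dedicated scraping library is usually a better fit.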