Strip HTML from a string in Ruby on Rails


121

I'm working with Ruby on Rails. Is there a way to strip HTML from a string using sanitize or an equivalent method, and keep only the text inside the value attribute of an input tag?


Not sanitize or an equivalent, but text.strip works
Keon

Answers:



183

If we want to use this in a model:

ActionView::Base.full_sanitizer.sanitize(html_string)

This is the code used inside the strip_tags method.


31
This works, but referencing ActionView from a model is awkward. A cleaner option is to require 'html/sanitizer' and instantiate your own sanitizer with HTML::FullSanitizer.new
Nik Haldimann

8
@nhaldimann, require 'html/sanitizer' raised an error for me, so I had to use Rails::Html::FullSanitizer.new instead: edgeapi.rubyonrails.org/classes/HTML/…
Linh Dam


24
ActionView::Base.full_sanitizer.sanitize(html_string)

A whitelist of tags and attributes can be specified as follows:

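# full_sanitizer strips every tag and ignores :tags/:attributes, so use the
# safe-list sanitizer here (named white_list_sanitizer on older Rails versions)
ActionView::Base.safe_list_sanitizer.sanitize(html_string, :tags => %w(img br p), :attributes => %w(src style))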

The statement above allows the tags img, br and p, and the attributes src and style.


9

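I've used the Loofah library, since it works for both HTML and XML (documents as well as string fragments). It is the engine behind the html sanitizer gem. I'm simply pasting the code example to show how easy it is to use.

Loofah Gem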

unsafe_html = "ohai! <div>div is safe</div> <script>but script is not</script>"

doc = Loofah.fragment(unsafe_html).scrub!(:strip)
doc.to_s    # => "ohai! <div>div is safe</div> "
doc.text    # => "ohai! div is safe "

1

How about this?

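# Note: this rewrites every string attribute of every record in place.
white_list_sanitizer = Rails::Html::WhiteListSanitizer.new
WHITELIST = ['p','b','h1','h2','h3','h4','h5','h6','li','ul','ol','small','i','u']

# Replace the placeholders with your own model classes.
[Your, Models, Here].each do |klass|
  klass.all.each do |ob|
    klass.attribute_names.each do |attr|
      value = ob.send(attr)
      next unless value.is_a?(String)

      # Keep only whitelisted tags and attributes, then drop empty paragraphs.
      cleaned = white_list_sanitizer.sanitize(value, tags: WHITELIST, attributes: %w(id style))
      ob.send("#{attr}=", cleaned.gsub(/<p>\s*<\/p>\r\n/im, ''))
      ob.save
    end
  end
end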

There is also Rails::Html::FullSanitizer.new, if you don't want to specify a whitelist.
Fredrik
Licensed under cc by-sa 3.0 with attribution required.