How to strip or escape HTML tags in Android

Accepted answer
Score: 244

The solutions in the answer linked to by @sparkymat generally require either regex, which is an error-prone approach, or installing a third-party library such as jsoup or jericho. A better solution on Android devices is just to make use of the Html.fromHtml() function:

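public String stripHtml(String html) {
    if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.N) {
        // API 24+: the single-argument overload is deprecated, so pass a mode flag.
        return Html.fromHtml(html, Html.FROM_HTML_MODE_LEGACY).toString();
    } else {
        // Older API levels only have the single-argument overload.
        return Html.fromHtml(html).toString();
    }
}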

This uses Android's built-in Html parser to build a Spanned representation of the input html without any html tags. The "Span" markup is then stripped by converting the output back into a string.

As discussed here, Html.fromHtml behaviour has changed since Android N. See the documentation for more info.

Score: 15

Sorry for the late post, but I think this might help others.

To just strip the html tags:

Html.fromHtml(htmltext).toString()

This way the html tags are removed and only the text is left, but the resulting string will not be formatted properly. Hence I did:

Html.fromHtml(htmltext).toString().replaceAll("\n", "").trim()

This first removes every newline character and then trims the leading and trailing whitespace. You can remove other unwanted characters in the same way.
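
Removing every newline can glue the last word of one paragraph onto the first word of the next. A gentler variant is sketched below in plain Java, operating on the string you get after calling toString(). The class and method names are hypothetical, and the handling of U+FFFC assumes that Html.fromHtml leaves that object replacement character where an image tag used to be; check this against your own input before relying on it.

```java
final class PlainTextCleanup {

    // Hypothetical helper: tidies text whose tags have already been stripped.
    public static String normalize(String stripped) {
        return stripped
                // Assumption: U+FFFC marks a removed <img>; drop it.
                .replace("\uFFFC", "")
                // Collapse runs of spaces, tabs and non-breaking spaces.
                .replaceAll("[ \\t\\u00A0]+", " ")
                // Remove a single space directly before or after a line break.
                .replaceAll(" ?\\n ?", "\n")
                // Collapse consecutive line breaks into one.
                .replaceAll("\\n{2,}", "\n")
                // Trim leading and trailing whitespace, as in the answer above.
                .trim();
    }

    public static void main(String[] args) {
        // Prints "Hello" and "world" on two separate lines.
        System.out.println(normalize("Hello\n\nworld  \n"));
    }
}
```

Unlike replaceAll("\n", ""), this keeps a single line break between paragraphs, so the words stay separated.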

Score: 12

You can alternatively use Html.escapeHtml(String) if you are targeting API 16 or above.

If you also target API levels below 16, you can instead use the class below by calling HtmlUtils.escapeHtml(String), which I simply pulled from the source of Html.escapeHtml(String).

public class HtmlUtils {

    public static String escapeHtml(CharSequence text) {
        StringBuilder out = new StringBuilder();
        withinStyle(out, text, 0, text.length());
        return out.toString();
    }

    private static void withinStyle(StringBuilder out, CharSequence text,
                                    int start, int end) {
        for (int i = start; i < end; i++) {
            char c = text.charAt(i);

            if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;");
            } else if (c == '&') {
                out.append("&amp;");
            } else if (c >= 0xD800 && c <= 0xDFFF) {
                if (c < 0xDC00 && i + 1 < end) {
                    char d = text.charAt(i + 1);
                    if (d >= 0xDC00 && d <= 0xDFFF) {
                        i++;
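                        // Combine the surrogate pair. The platform source ORs 0x010000 into
                        // the shifted value, which is wrong for code points from U+20000 up.
                        int codepoint = Character.toCodePoint(c, d);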
                        out.append("&#").append(codepoint).append(";");
                    }
                }
            } else if (c > 0x7E || c < ' ') {
                out.append("&#").append((int) c).append(";");
            } else if (c == ' ') {
                while (i + 1 < end && text.charAt(i + 1) == ' ') {
                    out.append("&nbsp;");
                    i++;
                }

                out.append(' ');
            } else {
                out.append(c);
            }
        }
    }
}

I am using this class, which works fine.
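
To make the effect of escaping concrete, here is a reduced sketch in plain Java. It is not the class above: the names are hypothetical, it only covers the three markup characters and surrogate pairs, and it passes every other character through unchanged (the full class also rewrites non-ASCII characters, control characters and runs of spaces, and drops unpaired surrogates).

```java
final class EscapeSketch {

    // Hypothetical helper: a cut-down version of the escaping logic.
    public static String escapeSketch(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;");
            } else if (c == '&') {
                out.append("&amp;");
            } else if (Character.isHighSurrogate(c) && i + 1 < text.length()
                    && Character.isLowSurrogate(text.charAt(i + 1))) {
                // Combine the surrogate pair into one code point and emit it
                // as a decimal character reference.
                int codePoint = Character.toCodePoint(c, text.charAt(i + 1));
                out.append("&#").append(codePoint).append(';');
                i++; // skip the low surrogate that was just consumed
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: &lt;b&gt;Hi&lt;/b&gt; &amp; bye
        System.out.println(escapeSketch("<b>Hi</b> & bye"));
        // U+1F600 as a surrogate pair; prints: &#128512;
        System.out.println(escapeSketch("\uD83D\uDE00"));
    }
}
```

Escaping keeps the tags visible as literal text, whereas the Html.fromHtml approach removes them, so pick whichever matches what you want to display.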

Score: 4

This is the newer alternative (API 16+):

android.text.Html.escapeHtml(your_html);

Score: 4

Html.fromHtml can be extremely slow for large html strings.

Here's how you can do it easily and quickly with jsoup:

Add this line to your gradle file:

implementation 'org.jsoup:jsoup:1.11.3'

Check the latest jsoup version here: https://jsoup.org/download

Add this line to your code:

String text = Jsoup.parse(htmlStr).text();

Check this link here to learn how to preserve line breaks:

How do I preserve line breaks when using jsoup to convert html to plain text?

Score: 2

Spanned spanned;
if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.N) {
    spanned = Html.fromHtml(textToShare, Html.FROM_HTML_MODE_LEGACY);
} else {
    spanned = Html.fromHtml(textToShare);
}
tv.setText(spanned.toString());

Score: 2

This is dead simple with jsoup

public static String html2text(String html) {
   return Jsoup.parse(html).text();
}

Score: 0

As it has not been mentioned yet, the way to do this in a backwards-compatible manner would be to use the HtmlCompat utility class, and simply call (with 0 if you require no specific flags to be used)

HtmlCompat.fromHtml(inputString, 0).toString()

Under the hood it already does all the required API checks for you:

if (Build.VERSION.SDK_INT >= 24) {
   return Html.fromHtml(source, flags);
}
return Html.fromHtml(source);

So for the input

<a href="https://www.stackoverflow.com">Click me!</a>

you will receive only the string 'Click me!' as output.
